Recursively compare two directories to ensure they have the same files and subdirectories

2024/11/19 9:43:38

From what I observe, filecmp.dircmp is recursive but inadequate for my needs, at least in py2. I want to compare two directories and all the files they contain. Does this exist, or do I need to build it myself (using os.walk, for example)? I prefer something pre-built, where someone else has already done the unit testing :)

The actual 'comparison' can be sloppy (ignore permissions, for example), if that helps.

I would like something boolean, and report_full_closure is a printed report. It also only goes down common subdirs. AFAIC, if either directory contains anything that exists only on the left or only on the right, the directories are different. I built this using os.walk instead.

Answer

Here's an alternative implementation of the comparison function using the filecmp module. It uses recursion instead of os.walk, so it is a little simpler. However, it does not recurse simply by using the common_dirs and subdirs attributes of dircmp, since that would implicitly use the default "shallow" file comparison, which is probably not what you want. In the implementation below, files with the same name are always compared by their contents only.
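To make the "shallow" point concrete before the implementation, here is a minimal sketch, assuming Python 3 (for tempfile.TemporaryDirectory and filecmp.clear_cache). The helper name shallow_vs_deep_demo and the file names are made up for illustration. It writes two files with the same size but different contents, forces their modification times to match, and compares them both ways.

```python
import filecmp
import os
import tempfile


def shallow_vs_deep_demo():
    """Return (shallow_result, deep_result) for two files that have the
    same size and modification time but different contents."""
    with tempfile.TemporaryDirectory() as tmp:
        path_a = os.path.join(tmp, "a.txt")  # hypothetical file names
        path_b = os.path.join(tmp, "b.txt")
        with open(path_a, "w") as f:
            f.write("abc")
        with open(path_b, "w") as f:
            f.write("xyz")

        # Give both files the same access and modification time so their
        # os.stat() signatures (file type, size, mtime) are identical.
        os.utime(path_a, (1000000000, 1000000000))
        os.utime(path_b, (1000000000, 1000000000))

        # Shallow: files with identical signatures are taken to be equal.
        filecmp.clear_cache()
        shallow = filecmp.cmp(path_a, path_b, shallow=True)

        # Deep: the file contents are actually read and compared.
        filecmp.clear_cache()
        deep = filecmp.cmp(path_a, path_b, shallow=False)
        return shallow, deep


if __name__ == "__main__":
    print(shallow_vs_deep_demo())
```

The shallow call reports the files as equal because only the stat signatures are consulted, while the deep call reads the bytes and reports a difference. Matching size and timestamp is a contrived setup here, but it can happen in practice with copied or generated files.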

import filecmp
import os.path


def are_dir_trees_equal(dir1, dir2):
    """
    Compare two directories recursively. Files in each directory are
    assumed to be equal if their names and contents are equal.

    @param dir1: First directory path
    @param dir2: Second directory path

    @return: True if the directory trees are the same and
        there were no errors while accessing the directories or files,
        False otherwise.
    """
    dirs_cmp = filecmp.dircmp(dir1, dir2)
    # Anything present on one side only, or impossible to compare,
    # makes the trees different.
    if (len(dirs_cmp.left_only) > 0 or len(dirs_cmp.right_only) > 0 or
            len(dirs_cmp.funny_files) > 0):
        return False
    # Compare the contents of the common files (shallow=False).
    (_, mismatch, errors) = filecmp.cmpfiles(
        dir1, dir2, dirs_cmp.common_files, shallow=False)
    if len(mismatch) > 0 or len(errors) > 0:
        return False
    # Recurse into every common subdirectory.
    for common_dir in dirs_cmp.common_dirs:
        new_dir1 = os.path.join(dir1, common_dir)
        new_dir2 = os.path.join(dir2, common_dir)
        if not are_dir_trees_equal(new_dir1, new_dir2):
            return False
    return True
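For comparison, the os.walk approach mentioned in the question could look roughly like the sketch below. This is not the asker's actual code, which was not posted; the names list_tree and trees_equal_walk are hypothetical. It assumes that two trees are equal when they contain the same relative directory and file paths and every file pair has identical contents.

```python
import filecmp
import os


def list_tree(root):
    """Return (dirs, files): sets of paths relative to root."""
    dirs, files = set(), set()
    for current, subdirs, filenames in os.walk(root):
        rel = os.path.relpath(current, root)
        for name in subdirs:
            dirs.add(os.path.normpath(os.path.join(rel, name)))
        for name in filenames:
            files.add(os.path.normpath(os.path.join(rel, name)))
    return dirs, files


def trees_equal_walk(dir1, dir2):
    """Return True if both trees have the same layout and file contents."""
    dirs1, files1 = list_tree(dir1)
    dirs2, files2 = list_tree(dir2)
    # Any directory or file present on one side only means "different".
    if dirs1 != dirs2 or files1 != files2:
        return False
    # cmpfiles joins each relative path onto dir1 and dir2, so nested
    # paths work; shallow=False forces a content comparison.
    _, mismatch, errors = filecmp.cmpfiles(
        dir1, dir2, sorted(files1), shallow=False)
    return not mismatch and not errors
```

Unlike the recursive version, this walks both trees completely before comparing any contents, so it does more work when the trees differ near the top. It also has no equivalent of the funny_files check and relies on cmpfiles reporting unreadable files as errors.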
