Given two text files A and B, what is an easy way to get the line numbers of lines in B that are not present in A? I see there's difflib, but I don't see an interface for retrieving line numbers.
difflib can give you what you need. Assume:
a.txt
this
is
a
bunch
of
lines
b.txt
this
is
a
different
bunch
of
other
lines
Code like this:
import difflib

fileA = open("a.txt", "rt").readlines()
fileB = open("b.txt", "rt").readlines()

d = difflib.Differ()
diffs = d.compare(fileA, fileB)
lineNum = 0

for line in diffs:
    # split off the two-character code that Differ puts in front of each line
    code = line[:2]
    # if the line is in both files or just b, increment the line number.
    if code in ("  ", "+ "):
        lineNum += 1
    # if this line is only in b, print the line number and the text on the line
    if code == "+ ":
        print("%d: %s" % (lineNum, line[2:].strip()))
gives output like:
bgporter@varese ~/temp:python diffy.py
4: different
7: other
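You'll also want to look at the difflib code "? " and decide how you want to handle it. Differ emits "? " lines as hints that point at the characters that changed within a line; they are not present in either file, so the loop above skips them and they never affect the line count.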
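To see what the "? " hint lines do in practice, here is a minimal sketch that compares two similar lines and drops the hints. The helper name diff_without_hints and the sample lines are made up for illustration; only difflib.Differ and its compare() method come from the library.

```python
import difflib


def diff_without_hints(a_lines, b_lines):
    """Return Differ's output with the '? ' hint lines removed.

    Hypothetical helper for illustration; the hint lines exist only in
    the diff output, never in either input file.
    """
    return [
        line
        for line in difflib.Differ().compare(a_lines, b_lines)
        if not line.startswith("? ")
    ]


# Two lines that are similar but not identical, so Differ may add hints.
a_lines = ["this is a bunch of lines\n"]
b_lines = ["this is a bunch of other lines\n"]

for line in diff_without_hints(a_lines, b_lines):
    print(line.rstrip("\n"))
```

A changed line shows up as a "- " entry for the old text and a "+ " entry for the new text, so the counting loop still reports it as a line that is only in b.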
(Also, in real code you'd want to use context managers to make sure the files get closed, and so on.)
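As a rough sketch of that last point, the same loop can be wrapped in a function that opens both files with context managers and returns the results instead of printing them. The function name lines_only_in_b and its return shape are my own choices here, not anything difflib provides.

```python
import difflib


def lines_only_in_b(path_a, path_b):
    """Return (line_number, text) pairs for lines Differ marks as only in b.

    Hypothetical helper for illustration; line numbers are 1-based and
    refer to positions in the second file.
    """
    # The with-statement closes both files even if reading fails.
    with open(path_a, "rt") as file_a, open(path_b, "rt") as file_b:
        a_lines = file_a.readlines()
        b_lines = file_b.readlines()

    found = []
    line_num = 0
    for line in difflib.Differ().compare(a_lines, b_lines):
        code = line[:2]
        # Lines in both files, or only in b, advance b's line counter.
        if code in ("  ", "+ "):
            line_num += 1
        # Lines only in b are the ones we want to report.
        if code == "+ ":
            found.append((line_num, line[2:].strip()))
    return found
```

Calling lines_only_in_b("a.txt", "b.txt") on the two sample files should give the same line numbers as the script above, as a list of pairs rather than printed text.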