Issue computing difference between two csv files

2024/10/13 9:18:07

I'm trying to compute the difference between two CSV files, A.csv and B.csv, to find the rows that were added in the second file. A.csv has the following data.

acct    ABC     88888888    99999999    ABC-GHD 4/1/18  4   1   2018    Redundant/RSK

B.csv has the following data.

acct    ABC     88888888    99999999    ABC-GHD 4/1/18  4   1   2018    Redundant/RSK
acct    ABC     88888888    99999999    ABC-GHD 4/1/18  4   1   2018    DT/89

To write the new rows added into an output file I'm using the following script.

input_file1 = "A.csv"
input_file2 = "B.csv"
output_path = "out.csv"

with open(input_file1, 'r') as t1:
    fileone = set(t1)

with open(input_file2, 'r') as t2, open(output_path, 'w') as outFile:
    for line in t2:
        if line not in fileone:
            outFile.write(line)

Expected output is :

acct    ABC     88888888    99999999    ABC-GHD 4/1/18  4   1   2018    DT/89 

Output obtained through the above script is :

acct    ABC     88888888    99999999    ABC-GHD 4/1/18  4   1   2018    Redundant/RSK
acct    ABC     88888888    99999999    ABC-GHD 4/1/18  4   1   2018    DT/89

I'm not sure where I'm making a mistake. I tried debugging it, but made no progress.

Answer

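You need to be careful with trailing newlines. Iterating over a file yields each line with its line ending still attached, so if the last line of A.csv has no trailing newline, it will not match the identical row in B.csv, which does end in one. It is safer to remove the newlines before comparing and then add them back when writing: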

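input_file1 = "A.csv"
input_file2 = "B.csv"
output_path = "out.csv"

# Read A.csv into a set of lines with the line endings removed.
with open(input_file1, 'r') as t1:
    fileone = set(t1.read().splitlines())

with open(input_file2, 'r') as t2, open(output_path, 'w') as outFile:
    for line in t2:
        # Remove only the line ending so both files are compared the same way.
        line = line.rstrip('\n')
        if line not in fileone:
            outFile.write(line + '\n')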
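
The newline pitfall can be reproduced with two small temporary files. The following is only a sketch: the helper name new_lines is hypothetical, the rows are shortened stand-ins for the data in the question, and it assumes A.csv was saved without a newline after its last row.

```python
import os
import tempfile


def new_lines(old_path, new_path):
    """Return the lines of new_path that do not appear in old_path, in file order."""
    with open(old_path, "r") as old_file:
        # Strip only line terminators so a missing final newline cannot matter.
        seen = {line.rstrip("\r\n") for line in old_file}
    added = []
    with open(new_path, "r") as new_file:
        for line in new_file:
            row = line.rstrip("\r\n")
            if row not in seen:
                added.append(row)
    return added


# Assumed setup: A.csv has no trailing newline, B.csv does.
workdir = tempfile.mkdtemp()
path_a = os.path.join(workdir, "A.csv")
path_b = os.path.join(workdir, "B.csv")
with open(path_a, "w") as f:
    f.write("acct\tABC\tRedundant/RSK")
with open(path_b, "w") as f:
    f.write("acct\tABC\tRedundant/RSK\nacct\tABC\tDT/89\n")

# The raw comparison fails because the line stored from A.csv has no "\n".
with open(path_a) as f:
    raw_a = set(f)
print("acct\tABC\tRedundant/RSK\n" in raw_a)  # False

# Comparing with line endings removed reports only the added row.
print(new_lines(path_a, path_b))  # ['acct\tABC\tDT/89']
```

Returning a list instead of writing directly to out.csv keeps the comparison separate from the output step; the caller can write each returned row followed by '\n'.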