Concatenating Multiple DataFrames with Non-Standard Columns

2024/10/13 9:16:06

Is there a good way to concatenate a list of DataFrames whose columns differ from one DataFrame to the next?

The desired outcome is to line up every column that has a match and to keep the unmatched ones off to the side. The unmatched columns need to be kept because a column with no match between the 1st and 2nd DataFrames in the list may still match between the 1st and 3rd, so discarding a column at the first failed match would not be ideal.

An example:

print list(datalist[0].columns)
>>>[u'1', u'2', u'3']
print list(datalist[1].columns)
>>>[u'1', u'2', u'4']
print list(datalist[2].columns)
>>>[u'2', u'3', u'4']

Where the output would be a dataframe like (stylistically represented here):

1 2 3 - 
1 2 - 4
- 2 3 4

Answer

data = pd.concat(datalist, join='outer', axis=0, ignore_index=True)

This works. I originally assumed that concat with join="outer" would simply stack the frames row by row without regard to column names. In fact, concat aligns on column names: with join="outer" (which is also the default) the result has the union of all columns, matching columns are lined up, and any column a given frame lacks is filled with NaN for that frame's rows, which is exactly what is desired. Hope this helps someone else.
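As a minimal sketch of that behaviour, the snippet below builds three small frames with the same column names as in the question. The cell values are made up purely for illustration, and the inner-join line is only there for contrast; it is not part of the original answer.

```python
import pandas as pd

# Hypothetical stand-ins for the frames in the question; the values
# are invented so that each cell shows which frame it came from.
datalist = [
    pd.DataFrame({"1": ["a1"], "2": ["a2"], "3": ["a3"]}),
    pd.DataFrame({"1": ["b1"], "2": ["b2"], "4": ["b4"]}),
    pd.DataFrame({"2": ["c2"], "3": ["c3"], "4": ["c4"]}),
]

# Outer join: keep the union of all columns, aligned by name.
# Cells for columns a frame does not have come back as NaN.
data = pd.concat(datalist, join="outer", axis=0, ignore_index=True)
print(data)

# For contrast, an inner join keeps only the columns shared by
# every frame (here just "2").
common = pd.concat(datalist, join="inner", axis=0, ignore_index=True)
print(list(common.columns))
```

Each row keeps its own values, and the NaN cells sit where the dashes are in the sketch in the question. ignore_index=True just renumbers the rows 0..n-1 instead of repeating each frame's original index.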

