Pandas: Unstacking One Column of a DataFrame

2024/10/1 19:37:48

I want to unstack one column in my pandas DataFrame. The DataFrame is indexed by 'Date', and I want to unstack the 'Country' column so that each country becomes its own column. The current DataFrame looks like this:

             Country   Product      Flow Unit  Quantity  
Date                                                         
2002-01-31   FINLAND  KEROSENE  TOTEXPSB  KBD    3.8129     
2002-01-31    TURKEY  KEROSENE  TOTEXPSB  KBD    0.2542     
2002-01-31  AUSTRALI  KEROSENE  TOTEXPSB  KBD   12.2787     
2002-01-31    CANADA  KEROSENE  TOTEXPSB  KBD    5.1161     
2002-01-31        UK  KEROSENE  TOTEXPSB  KBD   12.2013     

When I use df.pivot I get the following error: "ReshapeError: Index contains duplicate entries, cannot reshape". This makes sense, since every country reports on the same dates. What I would like is to unstack the 'Country' column so that each month's Date appears only once.

I would like the DataFrame headers to look like this, with Date still as the index:

Date        FINLAND  TURKEY  AUSTRALI  CANADA   Flow      Unit
2002-01-31  3.8129   0.2542  12.2787   5.1161   TOTEXPSB  KBD

I have worked on this for a while and I'm not getting anywhere, so any direction or insight would be great.

Also, note that you are only seeing the head of the DataFrame; years of data are in this format.

Thanks,

Douglas

Answer

If you can drop Product, Unit, and Flow then it should be as easy as

df.reset_index().pivot(columns='Country', index='Date', values='Quantity')

to give

Country     AUSTRALI  CANADA  FINLAND  TURKEY  UK
Date
2002-01-31  12.2787   5.1161  3.8129   0.2542  12.2013
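
For anyone who wants to try this, here is a minimal, self-contained sketch that rebuilds the five rows from the question and applies the pivot. The second part is an assumption on my side, not part of the answer above: it keeps Flow and Unit, as the question's desired layout shows, by moving them into the index with pivot_table and then back out to columns. The names `wide` and `wide_with_meta` are just illustrative.

```python
import pandas as pd

# Rebuild the head of the DataFrame shown in the question.
df = pd.DataFrame(
    {
        "Country": ["FINLAND", "TURKEY", "AUSTRALI", "CANADA", "UK"],
        "Product": ["KEROSENE"] * 5,
        "Flow": ["TOTEXPSB"] * 5,
        "Unit": ["KBD"] * 5,
        "Quantity": [3.8129, 0.2542, 12.2787, 5.1161, 12.2013],
    },
    index=pd.DatetimeIndex(["2002-01-31"] * 5, name="Date"),
)

# The approach from the answer: one row per Date, one column per Country.
wide = df.reset_index().pivot(index="Date", columns="Country", values="Quantity")
print(wide)

# Assumed variant: keep Flow and Unit next to the country columns.
# aggfunc="first" takes the first Quantity per (Date, Flow, Unit, Country);
# pick a different aggregation if duplicates in your data need combining.
wide_with_meta = (
    df.reset_index()
    .pivot_table(
        index=["Date", "Flow", "Unit"],
        columns="Country",
        values="Quantity",
        aggfunc="first",
    )
    .reset_index(level=["Flow", "Unit"])  # Date stays as the index
)
print(wide_with_meta)
```

If the full data has more than one row for the same Date and Country (for example, several Products or Flows), plain pivot will still complain about duplicate entries. In that case, filter to one Product/Flow first or use pivot_table with an aggregation that fits your data.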
