pandas groupby: can I select an agg function by one level of a column MultiIndex?

2024/5/20 5:09:31

I have a pandas DataFrame with a MultiIndex of columns:

import numpy as np
import pandas as pd

columns = pd.MultiIndex.from_tuples([(c, i) for c in ['a', 'b'] for i in range(3)])
df = pd.DataFrame(np.random.randn(4, 6), index=[0, 0, 1, 1], columns=columns)
print(df)

          a                             b
          0         1         2         0         1         2
0  0.582804  0.753118 -0.900950 -0.914657 -0.333091 -0.965912
0  0.498002 -0.842624  0.155783  0.559730 -0.300136 -1.211412
1  0.727019  1.522160  1.679025  1.738350  0.593361  0.411907
1  1.253759 -0.806279 -2.177582 -0.099210 -0.839822 -0.211349

I want to group by the index and apply the 'min' aggregation to the 'a' columns and the 'sum' aggregation to the 'b' columns.

I know I can do this by creating a dict that specifies the agg function for each column:

agg_dict = {'a': 'min', 'b': 'sum'}
full_agg_dict = {(c, i): agg_dict[c] for c in ['a', 'b'] for i in range(3)}
print(df.groupby(level=0).agg(full_agg_dict))

          a                             b
          0         1         2         0         1         2
0  0.498002 -0.842624 -0.900950 -0.354927 -0.633227 -2.177324
1  0.727019 -0.806279 -2.177582  1.639140 -0.246461  0.200558

Is there a simpler way? It seems like there should be a way to do this with agg_dict without using full_agg_dict.

Answer

I would use your approach as well, but here's another way that should work:

(df.stack(level=1)
   .groupby(level=[0, 1])
   .agg({'a': 'min', 'b': 'sum'})
   .unstack(-1)
)

For some reason groupby(level=[0, 1]) doesn't work for me, so I came up with:

(df.stack(level=1)
   .reset_index()
   .groupby(['level_0', 'level_1'])
   .agg({'a': 'min', 'b': 'sum'})
   .unstack('level_1')
)
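
If the goal is only to avoid spelling out full_agg_dict by hand, two other variants might be worth a look. This is a minimal sketch, not something from the answer above: the first derives the per-column dict from df.columns, and the second aggregates each top-level block on its own and glues the pieces back together with pd.concat. The names pieces, result and expected are made up for illustration.

```python
import numpy as np
import pandas as pd

# Same setup as in the question.
columns = pd.MultiIndex.from_tuples([(c, i) for c in ['a', 'b'] for i in range(3)])
df = pd.DataFrame(np.random.randn(4, 6), index=[0, 0, 1, 1], columns=columns)

agg_dict = {'a': 'min', 'b': 'sum'}

# Variant 1: build the per-column dict from the columns themselves,
# looking up the function by the first level of each column tuple.
full_agg_dict = {col: agg_dict[col[0]] for col in df.columns}
expected = df.groupby(level=0).agg(full_agg_dict)

# Variant 2: aggregate each top-level block separately, then concatenate.
# The dict keys ('a', 'b') become the outer level of the result's columns.
pieces = {
    key: df[key].groupby(level=0).agg(func)
    for key, func in agg_dict.items()
}
result = pd.concat(pieces, axis=1)

print(result)
```

Variant 1 stays closest to the original approach and does not hard-code the column labels. Variant 2 avoids the per-column dict entirely, at the cost of one groupby per top-level key.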