How to reference groupby index when using apply, transform, agg - Python Pandas?

2024/9/27 17:23:39

To be concrete, say we have two DataFrames:

df1:

    date    A
0   12/1/14 3
1   12/1/14 1
2   12/3/14 2
3   12/3/14 3
4   12/3/14 4
5   12/6/14 5

df2:

        B
12/1/14 10
12/2/14 20
12/3/14 10
12/4/14 30
12/5/14 10
12/6/14 20

Now I want to group df1 by date, sum the values of A within each group, and then normalize that sum by the value of B in df2 for the corresponding date. Something like this:

df1.groupby('date').agg(lambda x: np.sum(x)/df2.loc[x.date,'B'])

The problem is that none of aggregate, apply, or transform seems to give access to the group's index. Any idea how to work around this?

Answer

If you move the column into the index and group by that index level, each group passed to your function keeps the group key in its index, so the key is accessible through the .index property.

So, in your case, assuming that date is a regular column (NOT the index) in both DataFrames, this should work:

def f(x):
    return x.sum() / df2.set_index('date').loc[x.index[0], 'B']

df1.set_index('date').groupby(level='date').apply(f)

This produces:

               A
date            
2014-01-12  0.40
2014-03-12  0.90
2014-06-12  0.25
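
For reference, here is a minimal self-contained sketch of the approach above. It assumes the dates are kept as plain strings and that date is already the index of df2, as in the question, so the set_index call on df2 is dropped. The sample values are the ones from the question.

```python
import pandas as pd

df1 = pd.DataFrame({
    "date": ["12/1/14", "12/1/14", "12/3/14", "12/3/14", "12/3/14", "12/6/14"],
    "A": [3, 1, 2, 3, 4, 5],
})
df2 = pd.DataFrame(
    {"B": [10, 20, 10, 30, 10, 20]},
    index=["12/1/14", "12/2/14", "12/3/14", "12/4/14", "12/5/14", "12/6/14"],
)

def f(x):
    # x is the sub-frame for one date; every entry of x.index is that date,
    # so x.index[0] is the group key used to look up B in df2.
    return x.sum() / df2.loc[x.index[0], "B"]

result = df1.set_index("date").groupby(level="date").apply(f)
print(result)
```

Because the dates are strings here, the index of the result shows them as written in the question rather than as parsed timestamps.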

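If date is already the index of df2 (as it appears to be in the question), just use df2.loc[x.index[0], 'B'] in the above code.

If date is already the index of df1, change the last line to df1.groupby(level='date').apply(f). This requires the index to be named date; otherwise use level=0.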
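
Two alternatives may also be worth considering, sketched here under the same assumptions as the question's data (string dates, date as the index of df2). First, inside apply pandas sets a name attribute on each group to the group key, so the key can be read from the group's .name without moving date into the index. Second, since the sum and the division are both vectorized, the per-group sums can simply be divided by the matching rows of df2, with no custom function at all.

```python
import pandas as pd

df1 = pd.DataFrame({
    "date": ["12/1/14", "12/1/14", "12/3/14", "12/3/14", "12/3/14", "12/6/14"],
    "A": [3, 1, 2, 3, 4, 5],
})
df2 = pd.DataFrame(
    {"B": [10, 20, 10, 30, 10, 20]},
    index=["12/1/14", "12/2/14", "12/3/14", "12/4/14", "12/5/14", "12/6/14"],
)

# Option 1: read the group key from .name inside apply.
by_name = df1.groupby("date")["A"].apply(
    lambda s: s.sum() / df2.loc[s.name, "B"]
)

# Option 2: sum per group, then divide by the rows of df2 for the same dates.
sums = df1.groupby("date")["A"].sum()
by_align = sums / df2["B"].reindex(sums.index)

print(by_name)
print(by_align)
```

The second form avoids calling a Python function once per group, which tends to matter when there are many groups.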
