I have a data frame like this.
mydf = pd.DataFrame({'a':[1,1,3,3],'b':[np.nan,2,3,6],'c':[1,3,3,9]})a b c
0 1 NaN 1
1 1 2.0 3
2 3 3.0 3
3 3 6.0 9
I would like to have a resulting dataframe like this.
myResults = pd.concat([mydf.groupby('a').apply(lambda x: (x.b/x.c).max()), mydf.groupby('a').apply(lambda x: (x.b/x.c).min())], axis =1)
myResults.columns = ['max','min']max min
a
1 0.666667 0.666667
3 1.000000 0.666667
Basically i would like to have max and min of ratio of column b
and column c
for each group (grouped by column a
)
If it possible to achieve this by agg
?
I tried mydf.groupby('a').agg([lambda x: (x.b/x.c).max(), lambda x: (x.b/x.c).min()])
. It will not work, and seems column name b
and c
will not be recognized.
Another way i can think of is to add the ratio column first to mydf
. i.e. mydf['ratio'] = mydf.b/mydf.c
, and then use agg
on the updated mydf
like mydf.groupby('a')['ratio'],agg[max,min]
.
Is there a better way to achieve this through agg or other function? In summary, I would like to apply customized function to grouped DataFrame, and the customized function needs to read multiple columns from original DataFrame.