It is clear when doing
data.groupby(['A','B']).mean()
We get something multiindex by level 'A' and 'B' and one column with the mean of each group
how could I have the count(), std() simultaneously ?
so result looks like in a dataframe
A B mean count std
The following should work:
data.groupby(['A','B']).agg([pd.Series.mean, pd.Series.std, pd.Series.count])
basically call agg
and passing a list of functions will generate multiple columns with those functions applied.
Example:
In [12]:df = pd.DataFrame({'a':np.random.randn(5), 'b':[0,0,1,1,2]})
df.groupby(['b']).agg([pd.Series.mean, pd.Series.std, pd.Series.count])
Out[12]:a mean std count
b
0 -0.769198 0.158049 2
1 0.247708 0.743606 2
2 -0.312705 NaN 1
You can also pass the string of the method names, the common ones work, some of the more obscure ones don't I can't remember which but in this case they work fine, thanks to @ajcr for the suggestion:
In [16]:
df = pd.DataFrame({'a':np.random.randn(5), 'b':[0,0,1,1,2]})
df.groupby(['b']).agg(['mean', 'std', 'count'])Out[16]:a mean std count
b
0 -1.037301 0.790498 2
1 -0.495549 0.748858 2
2 -0.644818 NaN 1