Is there any opportunity in pandas to groupby data by MultiIndex? By this i mean passing to groupby function not only keys but keys and values to predefine dataframe columns?
a = np.array(['foo', 'foo', 'foo', 'bar', 'bar', 'foo', 'foo'], dtype=object)
b = np.array(['one', 'one', 'two', 'one', 'two', 'two', 'two'], dtype=object)
c = np.array(['dull', 'shiny', 'dull', 'dull', 'dull', 'shiny', 'shiny'], dtype=object)
df = pd.DataFrame([a, b, c]).T
df.columns = ['a', 'b', 'c']
df.groupby(['a', 'b', 'c']).apply(len)a b c
bar one dull 1two dull 1
foo one dull 1shiny 1two dull 1shiny 2
But what I actually want is the following:
mi = pd.MultiIndex(levels=[['foo', 'bar'], ['one', 'two'], ['dull', 'shiny']],labels=[[0, 0, 0, 0, 1, 1, 1, 1], [0, 0, 1, 1, 0, 0, 1, 1], [0, 1, 0, 1, 0, 1, 0, 1]])
#pseudocode
df.groupby(['a', 'b', 'c'], multi_index = mi).apply(len)
a b c
bar one dull 1shiny 0two dull 1shiny 0
foo one dull 1shiny 1two dull 1shiny 2
The way i see it is in creation of additional wrapper on groupby object. Or maybe this feature feets well to pandas philosophy and it can be included in the pandas lib?