I'd like to add subtotal rows for index #1 (ie. Fruits and Animal) and subtotal columns for columns (ie. 2015 and 2016).
For the subtotal columns, I could do something like this, but it seems inefficient to run this type of code for each year (2015 & 2016). Is there a better way? I don't think 'margins' will work because there are multiple subtotals.
df[('2015','2015_Total')] = df[('2015','1st')]+df[('2015','2nd')]
For the subtotal rows (e.g., fruits total and animal total), I'm not sure where to begin.
It is very complicated, because you need create Multiindex
in columns
and index
.
Create subtotals is easy - use groupby
with sum
. Then create Multiindex and last concat
new columns to original DataFrame
. Last you have to sort_index
(I add Total_
before value for correct sorting):
print df2015_____ 2016_______ 1st 2nd 1st 2nd
Fruits Apple 10 9 11 10Banana 20 22 21 20
Animal Lion 5 3 2 1Tiger 2 3 5 0df1 = df.groupby(level=0, axis=1).sum()
print df12015_____ 2016_______
Fruits Apple 19 21Banana 42 41
Animal Lion 8 3Tiger 5 5print df.columns.get_level_values(0).to_series().drop_duplicates().tolist()
['2015_____', '2016_______']#change index to multiindex
new_columns = zip(df.columns.get_level_values(0).to_series().drop_duplicates().tolist(),"Total_" + df1.columns.str[:4])
print new_columns
[('2015_____', 'Total_2015'), ('2016_______', 'Total_2016')]df1.columns = pd.MultiIndex.from_tuples(new_columns)
print df12015_____ 2016_______Total_2015 Total_2016
Fruits Apple 19 21Banana 42 41
Animal Lion 8 3Tiger 5 5df = pd.concat([df,df1], axis=1)
df2 = df.groupby(level=0, sort=False).sum()
print df22015_____ 2016_______ 2015_____ 2016_______1st 2nd 1st 2nd Total_2015 Total_2016
Animal 7 6 7 1 13 8
Fruits 30 31 32 30 61 62print df.index.levels[0][df.columns.labels[0]].to_series().drop_duplicates().tolist()
['Animal', 'Fruits']#change index to multiindex
new_idx=zip(df.index.levels[0][df.columns.labels[0]].to_series().drop_duplicates().tolist(),"Total_" + df2.index )
print new_idx
[('Animal', 'Total_Animal'), ('Fruits', 'Total_Fruits')]df2.index = pd.MultiIndex.from_tuples(new_idx)
print df22015_____ 2016_______ 2015_____ 2016_______1st 2nd 1st 2nd Total_2015 Total_2016
Animal Total_Animal 7 6 7 1 13 8
Fruits Total_Fruits 30 31 32 30 61 62df = pd.concat([df,df2])
df = df.sort_index(axis=1).sort_index()
print df2015_____ 2016_______ 1st 2nd Total_2015 1st 2nd Total_2016
Animal Lion 5 3 8 2 1 3Tiger 2 3 5 5 0 5Total_Animal 7 6 13 7 1 8
Fruits Apple 10 9 19 11 10 21Banana 20 22 42 21 20 41Total_Fruits 30 31 61 32 30 62