Question 1

I have a series that looks like this:

            delivery
2007-04-26  706           23
2007-04-27  705           10706         1089708           83710           13712           51802            4806            1812            3
2007-04-29  706           39708            4712            1
2007-04-30  705            3706         1016707            2
...
2014-11-04  1412          531501           11502           11512           1
2014-11-05  1411          471412        13341501          401502         4331504         1261506         1001508           71510           61512          511604           11612           5
Length: 26255, dtype: int64

where the query is: df.groupby([df.index.date, 'delivery']).size()

For each day, I need to pull out the delivery number which has the most volume. I feel like it would be something like:

df.groupby([df.index.date, 'delivery']).size().idxmax(axis=1)

However, this just returns me the idxmax for the entire dataframe; instead, I need the second-level idmax (not the date but rather the delivery number) for each day, not the entire dataframe (ie. it returns a vector).

Any ideas on how to accomplish this?

Question 2

Your example code doesn't work because the idxmax is executed after the groupby operation (so on the whole dataframe)

I'm not sure how to use idxmax on multilevel indexes, so here's a simple workaround.

Setting up data :

import pandas as pd
d= {'Date': ['2007-04-26', '2007-04-27', '2007-04-27', '2007-04-27','2007-04-27', '2007-04-28', '2007-04-28'], 'DeliveryNb': [706, 705, 708, 450, 283, 45, 89],'DeliveryCount': [23, 10, 1089, 82, 34, 100, 11]}df = pd.DataFrame.from_dict(d, orient='columns').set_index('Date')
print df

output

            DeliveryCount  DeliveryNb
Date                                 
2007-04-26             23         706
2007-04-27             10         705
2007-04-27           1089         708
2007-04-27             82         450
2007-04-27             34         283
2007-04-28            100          45
2007-04-28             11          89

creating custom function :

The trick is to use the reset_index() method (so you easily get the integer index of the group)

def func(df):idx = df.reset_index()['DeliveryCount'].idxmax()return df['DeliveryNb'].iloc[idx]

applying it :

g = df.groupby(df.index)
g.apply(func)

result :

Date
2007-04-26    706
2007-04-27    708
2007-04-28     45
dtype: int64

Python pandas idxmax for multiple indexes in a dataframe

Related Q&A

No of Pairs of consecutive prime numbers having difference of 6 like (23,29) from 1 to 2 billion

Building a docker image for a flask app fails in pip

Why is numba so fast?

How to create a field with a list of foreign keys in SQLAlchemy?

Implementing a recursive algorithm in pyspark to find pairings within a dataframe

Python Delegate Pattern - How to avoid circular reference?

Render Jinja after jQuery AJAX request to Flask

shape-preserving piecewise cubic interpolation for 3D curve in python

ForeignKey vs OneToOne field django [duplicate]

How to sort glob.glob numerically?