I need to reindex the second level of a pandas DataFrame so that, for each first-level value, the second level becomes a complete list 0, ..., N-1.
- I tried Allan/Hayden's approach, but it only renumbers the rows that already exist, so the result has exactly as many rows as before.
- What I want is for every missing second-level value to be inserted as a new row (with NaN values).
Example:
import pandas as pd

df = pd.DataFrame({
    'first': ['one', 'one', 'one', 'two', 'two', 'three'],
    'second': [0, 1, 2, 0, 1, 1],
    'value': [1, 2, 3, 4, 5, 6],
})
print(df)

   first  second  value
0    one       0      1
1    one       1      2
2    one       2      3
3    two       0      4
4    two       1      5
5  three       1      6

# Tried using Allan/Hayden's approach, but it is no good for this: it doesn't add the missing rows
df['second'] = df.reset_index().groupby(['first']).cumcount()
print(df)

   first  second  value
0    one       0      1
1    one       1      2
2    one       2      3
3    two       0      4
4    two       1      5
5  three       0      6

My desired result is:
   first  second  value
0    one       0      1
1    one       1      2
2    one       2      3
3    two       0      4
4    two       1      5
4    two       2    nan   <-- INSERTED
5  three       0      6
5  three       1    nan   <-- INSERTED
5  three       2    nan   <-- INSERTED
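
For reference, here is a minimal sketch of one way the desired shape might be produced. It is not a confirmed solution. It assumes that N is the size of the largest 'first' group (3 in this example) and that a fresh 0..8 row index is acceptable in place of the repeated 4 and 5 shown above. The names n, full_index and result are only illustrative.

```python
import pandas as pd

df = pd.DataFrame({
    'first': ['one', 'one', 'one', 'two', 'two', 'three'],
    'second': [0, 1, 2, 0, 1, 1],
    'value': [1, 2, 3, 4, 5, 6],
})

# Renumber 'second' within each 'first' group (the cumcount step tried above).
df['second'] = df.groupby('first').cumcount()

# Assumption: N is the size of the largest group.
n = int(df.groupby('first').size().max())

# Every (first, second) pair, keeping 'first' in order of appearance.
full_index = pd.MultiIndex.from_product(
    [df['first'].unique(), range(n)], names=['first', 'second']
)

# Reindexing against the full product inserts the missing pairs as NaN rows.
result = df.set_index(['first', 'second']).reindex(full_index).reset_index()
print(result)
```

Because the inserted rows hold NaN, the 'value' column comes back as float rather than int.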