vagrant@ubuntu-xenial:~/lb/f5/v12$ python
Python 2.7.12 (default, Nov 12 2018, 14:36:49)
[GCC 5.4.0 20160609] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import pandas as pd
>>> data = [{'name': 'bob', 'age': 20}, {'name': 'jim', 'age': 25}, {'name': 'bob', 'age': 30}]
>>> df = pd.DataFrame(data)
>>> df.set_index(keys='name', drop=False, inplace=True)
>>> df
      age name
name
bob    20  bob
jim    25  jim
bob    30  bob
>>> df.to_dict(orient='index')
{'bob': {'age': 30, 'name': 'bob'}, 'jim': {'age': 25, 'name': 'jim'}}
>>>
If we convert the DataFrame to a dictionary, the first entry for the duplicated key (bob, age 20) is lost. Is there a way to produce a dictionary whose values are lists of dictionaries, something like this?
{'bob': [{'age': 20, 'name': 'bob'}, {'age': 30, 'name': 'bob'}], 'jim': [{'age': 25, 'name': 'jim'}]}
It should be possible to do this if you group on the index.
groupby + dictionary comprehension
{k: g.to_dict(orient='records') for k, g in df.groupby(level=0)}
# {'bob': [{'age': 20, 'name': 'bob'}, {'age': 30, 'name': 'bob'}],
# 'jim': [{'age': 25, 'name': 'jim'}]}
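For readers who want to run the comprehension end to end, here is a minimal self-contained sketch. It assumes the same three records as in the question and builds the frame with a chained set_index call instead of inplace=True; the variable names result and expected are only illustrative.

```python
import pandas as pd

# Same records as in the question; 'bob' appears twice on purpose.
data = [{'name': 'bob', 'age': 20},
        {'name': 'jim', 'age': 25},
        {'name': 'bob', 'age': 30}]

# Keep 'name' as a column (drop=False) so it also shows up in each record.
df = pd.DataFrame(data).set_index('name', drop=False)

# One list of row dictionaries per distinct index value.
result = {k: g.to_dict(orient='records') for k, g in df.groupby(level=0)}

expected = {'bob': [{'age': 20, 'name': 'bob'}, {'age': 30, 'name': 'bob'}],
            'jim': [{'age': 25, 'name': 'jim'}]}
```

Rows within each group keep their original order, so bob's age-20 record comes before the age-30 one.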
Details
groupby allows us to partition the data based on unique keys:
for k, g in df.groupby(level=0):
    print(g, end='\n\n')

      age name
name
bob    20  bob
bob    30  bob

      age name
name
jim    25  jim
For each group, convert it into a list of dictionaries using the "records" orient:
for k, g in df.groupby(level=0):
    print(g.to_dict('records'))

[{'age': 20, 'name': 'bob'}, {'age': 30, 'name': 'bob'}]
[{'age': 25, 'name': 'jim'}]
Finally, store each list under its group key so that it can be looked up by name.
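To make the "list of records per key" idea concrete without pandas, here is a rough plain-Python sketch of the same grouping. It is not what groupby does internally; the helper name group_records is hypothetical and only mirrors the shape of the result.

```python
from collections import defaultdict


def group_records(records, key):
    """Group a list of dicts into {key value: [dicts sharing that value]}."""
    grouped = defaultdict(list)
    for record in records:
        # Append rather than assign, so duplicate keys accumulate.
        grouped[record[key]].append(record)
    return dict(grouped)


data = [{'name': 'bob', 'age': 20},
        {'name': 'jim', 'age': 25},
        {'name': 'bob', 'age': 30}]

grouped = group_records(data, 'name')
```

Appending to a list is the step that to_dict(orient='index') lacks: there, a later row with the same key simply replaces the earlier one.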
GroupBy.apply + to_dict
df.groupby(level=0).apply(lambda x: x.to_dict('records')).to_dict()
# {'bob': [{'age': 20, 'name': 'bob'}, {'age': 30, 'name': 'bob'}],
# 'jim': [{'age': 25, 'name': 'jim'}]}
apply does the same thing that the dictionary comprehension does: it iterates over each group. The only difference is that apply returns a Series, so it requires one final to_dict call at the end to turn the result into a dictionary.
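The two steps can be separated to show what the final to_dict call operates on. This is a small sketch assuming the same data as in the question; per_group is an illustrative name for the intermediate result.

```python
import pandas as pd

data = [{'name': 'bob', 'age': 20},
        {'name': 'jim', 'age': 25},
        {'name': 'bob', 'age': 30}]
df = pd.DataFrame(data).set_index('name', drop=False)

# Step 1: apply yields one value per group, here a list of row dicts.
per_group = df.groupby(level=0).apply(lambda g: g.to_dict(orient='records'))

# Step 2: Series.to_dict maps each index label (group key) to its value.
result = per_group.to_dict()
```

Here per_group is indexed by the group keys 'bob' and 'jim', which is why the last call produces the same mapping as the comprehension.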