I have a simple dataframe df with a column of lists, lists. I would like to generate an additional column based on lists.
The df looks like:
import pandas as pd
lists={1:[[1]],2:[[1,2,3]],3:[[2,9,7,9]],4:[[2,7,3,5]]}
#create test dataframe
df=pd.DataFrame.from_dict(lists,orient='index')
df=df.rename(columns={0:'lists'})
df
   lists
1 [1]
2 [1, 2, 3]
3 [2, 9, 7, 9]
4 [2, 7, 3, 5]
I would like df to look like this:
df
Out[9]: lists rolllists
1 [1] [1]
2 [1, 2, 3] [1, 1, 2, 3]
3 [2, 9, 7, 9] [1, 2, 3, 2, 9, 7, 9]
4 [2, 7, 3, 5] [2, 9, 7, 9, 2, 7, 3, 5]
Basically I want to 'sum'/append the rolling 2 lists. Note that in row 1, because there is only one list, rolllists is just that list. In row 2 there are two lists, so I want them appended. Then for row 3, append df[2].lists and df[3].lists, etc. I have worked on similar things before; for reference, see: Pandas Dataframe, Column of lists, Create column of sets of cumulative lists, and record by record differences.
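To make the target concrete, here is a minimal sketch of the behaviour I am after. It does not use rolling at all; it assumes a window of exactly 2 and simply pairs each row with the previous one via shift. The dataframe is rebuilt directly from a list of lists for brevity, which is my shortcut and not the construction shown earlier.

```python
import pandas as pd

# Same data as the test dataframe, built directly as a column of lists.
df = pd.DataFrame(
    {'lists': [[1], [1, 2, 3], [2, 9, 7, 9], [2, 7, 3, 5]]},
    index=[1, 2, 3, 4],
)

# shift(1) lines each row up with the previous row's list; the first row gets NaN.
prev = df['lists'].shift(1)

# Append previous + current; when there is no previous list, copy the current one.
df['rolllists'] = [
    p + cur if isinstance(p, list) else list(cur)
    for p, cur in zip(prev, df['lists'])
]

print(df)
```

This only covers a window of 2; a larger window would need more shifts or a different approach.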
In addition, once the part above works, I want to do this in a groupby. The example above would be one group, so the df might look like this in the groupby:
Group lists rolllists
1 A [1] [1]
2 A [1, 2, 3] [1, 1, 2, 3]
3 A [2, 9, 7, 9] [1, 2, 3, 2, 9, 7, 9]
4 A [2, 7, 3, 5] [2, 9, 7, 9, 2, 7, 3, 5]
5 B [1] [1]
6 B [1, 2, 3] [1, 1, 2, 3]
7 B [2, 9, 7, 9] [1, 2, 3, 2, 9, 7, 9]
8 B [2, 7, 3, 5] [2, 9, 7, 9, 2, 7, 3, 5]
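As a sketch of the grouped version, the same previous-plus-current idea can be applied to each group separately so that lists never carry over from group A into group B. The helper name roll_append is hypothetical, and the window is again assumed to be exactly 2.

```python
import pandas as pd

base = [[1], [1, 2, 3], [2, 9, 7, 9], [2, 7, 3, 5]]
df = pd.DataFrame(
    {'Group': ['A'] * 4 + ['B'] * 4, 'lists': base + base},
    index=range(1, 9),
)

def roll_append(s):
    """Hypothetical helper: append each list to the previous list in the same Series."""
    prev = s.shift(1)  # first element of the Series has no predecessor (NaN)
    out = [p + cur if isinstance(p, list) else list(cur) for p, cur in zip(prev, s)]
    return pd.Series(out, index=s.index)

# Apply the helper to each group's 'lists' and stitch the pieces back together;
# the assignment aligns on the original index.
pieces = [roll_append(g) for _, g in df.groupby('Group')['lists']]
df['rolllists'] = pd.concat(pieces)

print(df)
```

Iterating over the groups explicitly sidesteps any question of how groupby aggregation handles a column of Python lists.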
I have tried various things like df.lists.rolling(2).sum() and I get this error:
TypeError: cannot handle this type -> object
in Pandas 0.24.1. Unfortunately, in Pandas 0.22.0 the command doesn't error, but instead returns the exact same values as in lists. So it looks like newer versions of Pandas can't sum lists? That's a secondary issue.
Would love any help! Have fun!