Flatten list of lists within dictionary values before processing in Pandas

2024/11/15 11:54:36

Issue:

If I need to flatten a list of lists, I use a list comprehension like this one to turn it into a single list:

[item for sublist in l for item in sublist]
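As a quick illustration (the sample values here are made up for this sketch), the comprehension handles a uniform two-level list, but on a value that mixes strings and lists it splits the top-level strings into characters, since strings are iterable too:

```python
# Uniform two-level list: the comprehension flattens it as expected.
l = [[1, 2], [3], [4, 5]]
flat = [item for sublist in l for item in sublist]
print(flat)  # [1, 2, 3, 4, 5]

# Mixed value (hypothetical, shaped like the dictionary values in this question):
# top-level strings are iterated character by character.
mixed = ['ab', ['cd', 'ef']]
broken = [item for sublist in mixed for item in sublist]
print(broken)  # ['a', 'b', 'cd', 'ef']
```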

I have a dictionary where some of the values are lists of lists, and I need to flatten these into single lists before importing the data into Pandas.

Current data:

defaultdict(list,{'object network fake-1': [' host 10.0.0.1'],'object network fake12': [' host 10.0.0.12'],'object network fake2': [' host 10.0.0.2 '],'object network fake3': [' host 10.0.0.0 255.255.255.0'],'object network fake4': [' host 10.0.0.4'],'object network fake5': [' host 10.0.0.5'],'object-group network prt-apps': [' network-object object fake-1',' network-object object fake2',' network-object object fake3',' network-object object fake121'],'object-group network prt-apps2': [' network-object object fake4',' group-object prt-apps',[' network-object object fake-1',' network-object object fake2',' network-object object fake3',' network-object object fake121']],'object-group network prt-apps3': [' network-object object fake5',' group-object prt-apps2',[' network-object object fake4',' group-object prt-apps',[' network-object object fake-1',' network-object object fake2',' network-object object fake3',' network-object object fake121']]]})

Desired data structure:

defaultdict(list,{'object network fake-1': [' host 10.0.0.1'],'object network fake12': [' host 10.0.0.12'],'object network fake2': [' host 10.0.0.2 '],'object network fake3': [' host 10.0.0.0 255.255.255.0'],'object network fake4': [' host 10.0.0.4'],'object network fake5': [' host 10.0.0.5'],'object-group network prt-apps': [' network-object object fake-1',' network-object object fake2',' network-object object fake3',' network-object object fake121'],'object-group network prt-apps2': [' network-object object fake4',' group-object prt-apps',' network-object object fake-1',' network-object object fake2',' network-object object fake3',' network-object object fake121'],'object-group network prt-apps3': [' network-object object fake5',' group-object prt-apps2',' network-object object fake4',' group-object prt-apps',' network-object object fake-1',' network-object object fake2',' network-object object fake3',' network-object object fake121']})

I have searched SO for this and do not see an example that I could use. Is there a simple way to flatten these kinds of 'list of lists' containers within a dictionary value?

This is how I have processed other dictionary structures when loading them into Pandas, but it does not work with the first dictionary above:

pd.DataFrame(dict([ (k,pd.Series(v)) for k,v in asa.iteritems() ]))

Answer:

The following does the job as I understand it (for your specific example, it relies on list concatenation with +):

def unpack(l):
    j = []
    for i in l:
        if type(i) != list:
            j.append(i)
        else:
            # Nested list: flatten it recursively and concatenate.
            j = j + unpack(i)
    return j

j = {}
for k, v in l.items():
    j[k] = unpack(v)

Starting with the object as dict in your example:

l = {'object network fake-1': [' host 10.0.0.1'],'object network fake12': [' host 10.0.0.12'],'object network fake2': [' host 10.0.0.2 '],'object network fake3': [' host 10.0.0.0 255.255.255.0'],'object network fake4': [' host 10.0.0.4'],'object network fake5': [' host 10.0.0.5'],'object-group network prt-apps': [' network-object object fake-1',' network-object object fake2',' network-object object fake3',' network-object object fake121'],'object-group network prt-apps2': [' network-object object fake4',' group-object prt-apps',[' network-object object fake-1',' network-object object fake2',' network-object object fake3',' network-object object fake121']],'object-group network prt-apps3': [' network-object object fake5',' group-object prt-apps2',[' network-object object fake4',' group-object prt-apps',[' network-object object fake-1',' network-object object fake2',' network-object object fake3',' network-object object fake121']]]}

you end up with

j = {'object network fake12': [' host 10.0.0.12'],'object-group network prt-apps': [' network-object object fake-1',' network-object object fake2',' network-object object fake3',' network-object object fake121'],'object network fake-1': [' host 10.0.0.1'],'object network fake2': [' host 10.0.0.2 '],'object network fake3': [' host 10.0.0.0 255.255.255.0'],'object-group network prt-apps2': [' network-object object fake4',' group-object prt-apps',' network-object object fake-1',' network-object object fake2',' network-object object fake3',' network-object object fake121'],'object-group network prt-apps3': [' network-object object fake5',' group-object prt-apps2',' network-object object fake4',' group-object prt-apps',' network-object object fake-1',' network-object object fake2',' network-object object fake3',' network-object object fake121'],'object network fake4': [' host 10.0.0.4'],'object network fake5': [' host 10.0.0.5']}
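
For loading the result into Pandas, here is a minimal sketch of the same idea using a recursive generator and `isinstance`. The names `flatten` and `flat` and the shortened sample data are assumptions made for this example, not part of the original code. The DataFrame construction is the one from the question, with `items()` in place of `iteritems()`, which exists only in Python 2:

```python
import pandas as pd

def flatten(value):
    """Yield the non-list items of an arbitrarily nested list, depth first."""
    for item in value:
        if isinstance(item, list):
            yield from flatten(item)
        else:
            yield item

# Shortened, hypothetical version of the dictionary from the question.
asa = {
    'object network fake4': [' host 10.0.0.4'],
    'object-group network prt-apps2': [
        ' network-object object fake4',
        ' group-object prt-apps',
        [' network-object object fake-1', ' network-object object fake2'],
    ],
}

flat = {k: list(flatten(v)) for k, v in asa.items()}

# Wrapping each value in a Series lets columns of unequal length be padded with NaN.
df = pd.DataFrame({k: pd.Series(v) for k, v in flat.items()})
print(df.shape)  # (4, 2)
```

Unlike `j = j + unpack(i)`, which builds a new list at every level of nesting, the generator yields items one at a time, though for data of this size either approach should be fine.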