Create a DataFrame from a dictionary of lists with variable length

2024/9/25 0:59:42

I have a dictionary of lists that looks like this:

from collections import defaultdict

dd = defaultdict(list, {'row1': ['Affinity'], 'row2': ['Ahmc', 'Garfield', 'Medical Center'], 'row3': ['Alamance', 'Macbeth'], 'row4': [], 'row5': ['Mayday']})

I want to convert this to a DataFrame. The output should look like this:

ID    SYN1      SYN2      SYN3            SYN4  SYN5
row1  Affinity
row2  Ahmc      Garfield  Medical Center
row3  Alamance  Macbeth
row4
row5  Mayday

Answer

collections.defaultdict is a subclass of dict.

So you can just use pd.DataFrame.from_dict:

import pandas as pd

res = pd.DataFrame.from_dict(dd, orient='index')
# Iterating over a DataFrame yields its column labels (0, 1, 2 here)
res.columns = [f'SYN{i+1}' for i in res]
print(res)

          SYN1      SYN2            SYN3
row1  Affinity      None            None
row2      Ahmc  Garfield  Medical Center
row3  Alamance   Macbeth            None
row4      None      None            None
row5    Mayday      None            None
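
The result above has only as many SYN columns as the longest list, keeps the row labels in the index, and shows None for missing values, whereas the layout in the question has an ID column, columns SYN1 to SYN5, and blanks. A minimal sketch of one way to get closer to that layout, assuming the column count is fixed at 5 (the name n_cols is introduced here for illustration) and that empty strings are acceptable as blanks:

```python
from collections import defaultdict

import pandas as pd

dd = defaultdict(list, {
    'row1': ['Affinity'],
    'row2': ['Ahmc', 'Garfield', 'Medical Center'],
    'row3': ['Alamance', 'Macbeth'],
    'row4': [],
    'row5': ['Mayday'],
})

n_cols = 5  # assumption: the question's layout always has SYN1..SYN5

# Same starting point as the answer: one row per key, ragged lists padded
res = pd.DataFrame.from_dict(dd, orient='index')
res.columns = [f'SYN{i + 1}' for i in range(res.shape[1])]

# Replace missing values with empty strings, then add the absent SYN columns
res = res.fillna('')
res = res.reindex(columns=[f'SYN{i}' for i in range(1, n_cols + 1)], fill_value='')

# Move the row labels out of the index into a regular column named ID
res = res.rename_axis('ID').reset_index()

print(res)
```

If the number of columns should instead follow the longest list, the reindex step can simply be dropped.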
