Question 1

The workflow is:

Read a CSV file using Python's pandas library and get the Variation column.
The Variation column data is:

Variation
----------
Color Family : Black,  Size:Int:L
Color Family : Blue, Size:Int:M
Color Family : Red, Size:Int:Xl

But I want to split this data into separate columns, sorted, and save it as an xlsx file:

Color        Size
-------      ------
Black           L
Blue            M
Red             XL

My code is:

# Splitting variation data
# Taking the Variation column data in a variable
get_variation = df_new['Variation']
# Splitting column data
for part in get_variation:
    format_part = str(part)
    data = re.split(r'[:,]', format_part)
    df_new['Color'] = data[1]
    df_new['Size'] = data[4]

But my output comes out as:

Color     Size
------   ------
Black      L
Black      L
Black      L

Question 2

You can do this without mapping or iteration by using the pandas str.replace() function. Try this:

# Replace "Color Family : " with ""
df['Variation'] = df['Variation'].str.replace("Color Family : ", "")
# Replace " Size:Int:" with ""
df['Variation'] = df['Variation'].str.replace(" Size:Int:", "")
# Split on "," to separate Color and Size, then expand into columns
df[['Color', 'Size']] = df['Variation'].str.split(',', expand=True)
print(df)

The output will have the color and size values in separate Color and Size columns.
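
As an alternative sketch, the same split can be done in one step with str.extract and a regular expression, which also tolerates the uneven spacing in the sample rows. The sample frame below is built by hand from the rows in the question (in practice it would come from pd.read_csv), and the names pattern, parts and result are only illustrative. Upper-casing the size is an assumption based on the desired output showing "XL" where the data has "Xl".

```python
import pandas as pd

# Sample rows copied from the question; a real frame would come from pd.read_csv.
df = pd.DataFrame({
    "Variation": [
        "Color Family : Black,  Size:Int:L",
        "Color Family : Blue, Size:Int:M",
        "Color Family : Red, Size:Int:Xl",
    ]
})

# One named capture group per target column; \s* absorbs the uneven spacing.
pattern = r"Color Family\s*:\s*(?P<Color>[^,]+?)\s*,\s*Size:Int:\s*(?P<Size>\S+)"
parts = df["Variation"].str.extract(pattern)

# Assumption: sizes should be upper case, so "Xl" becomes "XL".
parts["Size"] = parts["Size"].str.upper()

# Sort by color and renumber the rows.
result = parts.sort_values("Color").reset_index(drop=True)
print(result)
```

To save the result as a spreadsheet, something like result.to_excel("variation.xlsx", index=False) should work, provided an Excel writer engine such as openpyxl is installed; the file name here is hypothetical.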

How to fix pandas column data
