How to fix pandas column data

2024/7/4 7:34:46

The workflow is:

  • Read a CSV file using Python's pandas library and get the Variation column
  • The Variation column data is:

Variation
----------
Color Family : Black,  Size:Int:L
Color Family : Blue, Size:Int:M
Color Family : Red, Size:Int:Xl

  • But I want to split this data into separate columns, sorted, and save it as an xlsx file:

Color        Size
-------      ------
Black           L
Blue            M
Red             XL

My code is:

# splitting variation data
# taking variation column data in a var
get_variation = df_new['Variation']
# Splitting column data
for part in get_variation:
    format_part = str(part)
    data = re.split(r'[:,]', format_part)
    df_new['Color'] = data[1]
    df_new['Size'] = data[4]

But my output comes out as:

Color     Size
------   ------
Black      L
Black      L
Black      L

Answer

You can do this without mapping or iteration by using the pandas str.replace() function. Try this:

# Replace "Color Family : " with ""
df['Variation'] = df['Variation'].str.replace("Color Family : ", "")
# Replace " Size:Int:" with ""
df['Variation'] = df['Variation'].str.replace(" Size:Int:", "")
# Split on "," to separate Color and Size, then expand into columns
df[['Color', 'Size']] = df['Variation'].str.split(',', expand=True)
print(df)

The output will have the color and size in separate Color and Size columns.
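
As a side note, the loop in the question assigns a single scalar to df_new['Color'] and df_new['Size'] on every pass, and a scalar assigned to a column is broadcast to all rows, which is why every row ends up identical. An alternative to the replace-and-split approach is a rough sketch using str.extract with named groups, which also trims stray whitespace. The regular expression, the upper-casing of Size, and the variations.xlsx file name are assumptions made for illustration, not part of the original answer.

```python
import pandas as pd

# Sample data copied from the question (normally this comes from read_csv).
df = pd.DataFrame({
    "Variation": [
        "Color Family : Black,  Size:Int:L",
        "Color Family : Blue, Size:Int:M",
        "Color Family : Red, Size:Int:Xl",
    ]
})

# Assumed pattern: named groups become the column names of the result.
pattern = (
    r"Color Family\s*:\s*(?P<Color>[^,]+?)\s*,"
    r"\s*Size:Int:\s*(?P<Size>\S+)"
)

# str.extract works on the whole column at once, so no Python loop is needed.
parts = df["Variation"].str.extract(pattern)

# Assumption: sizes should be upper-case, as in the desired output ("XL").
parts["Size"] = parts["Size"].str.upper()

# Sort by color and renumber the rows.
result = parts.sort_values("Color").reset_index(drop=True)
print(result)

# Saving to xlsx needs an Excel writer engine such as openpyxl installed:
# result.to_excel("variations.xlsx", index=False)
```

Rows that do not match the pattern come back as NaN in both columns, so it may be worth checking result.isna().any() before saving.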
