Make a for loop execute in parallel over Pandas columns

2024/10/5 20:27:31

Please convert the code below to run in parallel. I am mapping a nested dictionary onto pandas column values. The code works correctly but takes a lot of time, so I would like to parallelize the for loop. (Note: df.replace(Source_Dictionary) also does the job, but takes about three times as long as the code below.)

df = pd.DataFrame({'one':['bab'],'two':['abb'],'three':['bb']})
Source_Dictionary = {'one':{'dadd':1,'bab':1.5},'two':{'ab':2},'three':{'cc':1,'bb':3}}
required_columns = ['one','two','three']
def Feature_Map(x):
    df[x] = df[x].map(Source_Dictionary[x]).fillna(0)

for i in required_columns:
    Feature_Map(i)

print(df)

   one  two  three
0  1.5  0.0      3

Answer

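To speed up execution you can use the multiprocessing module. The number of workers and the resulting performance depend on the resources available. Let's suppose you can afford 4 workers running in parallel. Note that the ThreadPool used below runs its workers as threads rather than separate processes, so the actual speedup depends on the workload.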

Your function:

def Feature_Map(x):
    df[x] = df[x].map(Source_Dictionary[x]).fillna(0)

Multi processing:

from multiprocessing.pool import ThreadPool

pool = ThreadPool(processes=4)
for i in required_columns:
    # args must be a tuple: (i,) rather than (i), which is just the string itself
    pool.apply_async(Feature_Map, (i,))

You can also add code that waits for all tasks to finish before the program exits, for example by calling pool.close() followed by pool.join().
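
One way to wait for the tasks is sketched below, assuming the same df, Source_Dictionary and required_columns as in the question. The helper name map_column is hypothetical and not part of the original answer. It returns the mapped column instead of assigning into df from a worker thread, and the results are written back in the main thread. Each AsyncResult.get() call blocks until its task has finished and re-raises any exception that occurred in the worker, so failures are not silently lost.

```python
from multiprocessing.pool import ThreadPool

import pandas as pd

df = pd.DataFrame({'one': ['bab'], 'two': ['abb'], 'three': ['bb']})
Source_Dictionary = {'one': {'dadd': 1, 'bab': 1.5},
                     'two': {'ab': 2},
                     'three': {'cc': 1, 'bb': 3}}
required_columns = ['one', 'two', 'three']


def map_column(name):
    # Hypothetical helper: return the mapped column rather than mutating df.
    return df[name].map(Source_Dictionary[name]).fillna(0)


with ThreadPool(processes=4) as pool:
    # Submit one task per column; note the one-element tuple (name,).
    pending = {name: pool.apply_async(map_column, (name,))
               for name in required_columns}
    # get() blocks until the task is done and re-raises worker exceptions.
    results = {name: task.get() for name, task in pending.items()}

# Write the results back in the main thread only.
for name, series in results.items():
    df[name] = series

print(df)
```

Whether this is faster than the plain loop depends on the size of the data and on how much of the pandas work can run concurrently across threads, so it is worth timing both versions on the real DataFrame.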

You can refer to https://docs.python.org/2/library/multiprocessing.html for detailed usage.

https://en.xdnf.cn/q/119031.html
