Multiprocessing code runs repeatedly

2024/9/25 6:24:33

I want to create a process using the Python multiprocessing module, as part of a larger script. (I also want a lot of other things from it, but for now I will settle for this.)

I copied the most basic code from the multiprocessing docs and modified it slightly.

However, everything outside of the if __name__ == '__main__': statement gets repeated every time p.join() is called.

This is my code:

from multiprocessing import Process

data = 'The Data'
print(data)

# worker function definition
def f(p_num):
    print('Doing Process: {}'.format(p_num))

print('start of name == main ')

if __name__ == '__main__':
    print('Creating process')
    p = Process(target=f, args=(data,))
    print('Process made')
    p.start()
    print('process started')
    p.join()
    print('process joined')

print('script finished')

This is what I expected:

The Data
start of name == main 
Creating process
Process made
process started
Doing Process: The Data
process joined
script finished

Process finished with exit code 0

This is the reality:

The Data
start of name == main 
Creating process
Process made
process started
The Data                         <- wrongly repeated line
start of name == main            <- wrongly repeated line
script finished                  <- wrongly executed early line
Doing Process: The Data
process joined
script finished

Process finished with exit code 0

I am not sure whether this is caused by the if statement, by p.join(), or by something else, and by extension why it is happening. Can someone please explain what caused this and why?

For clarity, because some people cannot replicate my problem: I am using Windows Server 2012 R2 Datacenter and Python 3.5.3.

Answer

The way multiprocessing works on Windows is that each child process imports the parent script. Windows has no fork, so each child starts in a fresh interpreter, which is probably why people on platforms that fork could not reproduce this. In Python, when you import a script, every statement at module level (everything not inside a function body) is executed. As I understand it, __name__ is set differently when a script is imported than when it is run directly from the command line, where __name__ == '__main__' (see the linked SO answer for a better understanding). In the child, __name__ does not equal '__main__', which is why the code under if __name__ == '__main__': is not executed there, while everything outside it runs again.
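To see the effect without starting any processes, here is a minimal sketch that runs the same small script twice with different values of __name__. The script text, the file name demo_script.py and the helper run_as are made up for this illustration. The name '__mp_main__' is what CPython's spawn start method uses for the re-imported main module in the versions I have looked at, so treat it as an assumption rather than a guarantee.

```python
import contextlib
import io
import os
import runpy
import tempfile

# A tiny stand-in for the script in the question (hypothetical content).
SCRIPT = """
print('top level runs')
if __name__ == '__main__':
    print('guarded block runs')
"""


def run_as(name):
    """Execute SCRIPT with __name__ set to `name`; return the printed lines."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, 'demo_script.py')
        with open(path, 'w') as fh:
            fh.write(SCRIPT)
        buf = io.StringIO()
        with contextlib.redirect_stdout(buf):
            runpy.run_path(path, run_name=name)
    return buf.getvalue().splitlines()


# Run directly, as the parent process does.
as_parent = run_as('__main__')
# Run the way a spawned child re-imports the script (assumed name).
as_child = run_as('__mp_main__')

print(as_parent)
print(as_child)
```

The first list contains both lines and the second contains only the top-level line, which matches the repeated output in the question.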

Anything you don't want executed in the child processes should be moved into the if __name__ == '__main__': section of your code, as this will only run for the parent process, i.e. the script you run initially.
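Applied to the script from the question, one possible layout looks like the sketch below. Only the import and the function definitions stay at module level; every statement with a side effect moves into a function called from the guard. The names main and describe are introduced here for illustration and are not part of the original code.

```python
from multiprocessing import Process


def describe(p_num):
    """Build the worker's message (hypothetical helper for this sketch)."""
    return 'Doing Process: {}'.format(p_num)


def f(p_num):
    # Worker function: this is what runs in the child process.
    print(describe(p_num))


def main():
    # Everything with side effects lives here, so a child that
    # re-imports this file does not repeat it.
    data = 'The Data'
    print(data)
    print('Creating process')
    p = Process(target=f, args=(data,))
    print('Process made')
    p.start()
    print('process started')
    p.join()
    print('process joined')
    print('script finished')


if __name__ == '__main__':
    main()
```

With this layout, each of the parent's lines should be printed once, because the child only re-executes the import and the function definitions before calling f.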

Hope this helps a bit. There are some more resources around Google that better explain this if you look around. I linked the official Python resource for the multiprocessing module, and I recommend you look through it.
