Python threading vs. multiprocessing in Linux

2024/10/15 7:24:39

Based on this question, I assumed that creating a new process should be almost as fast as creating a new thread in Linux. However, a little test showed a very different result. Here's my code:

from multiprocessing import Process, Pool
from threading import Thread

times = 1000

def inc(a):
    b = 1
    return a + b

def processes():
    for i in xrange(times):
        p = Process(target=inc, args=(i, ))
        p.start()
        p.join()

def threads():
    for i in xrange(times):
        t = Thread(target=inc, args=(i, ))
        t.start()
        t.join()

Tests:

>>> timeit processes()
1 loops, best of 3: 3.8 s per loop
>>> timeit threads()
10 loops, best of 3: 98.6 ms per loop

So, processes are almost 40 times slower to create! Why does this happen? Is it specific to Python or to these libraries? Or did I just misinterpret the answer above?


UPD 1. To make it clearer: I understand that this piece of code doesn't actually introduce any concurrency. The goal here is to test the time needed to create a process and a thread. To get real concurrency with Python, one can use something like this:

def pools():
    pool = Pool(10)
    pool.map(inc, xrange(times))

which really runs much faster than the threaded version.
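For readers on Python 3, where xrange no longer exists, the pool version might look like the sketch below. The worker count of 10 comes from the snippet above. The explicit "fork" start method is an assumption chosen to match the Linux setting of the question, and the parameter names times and workers are only illustrative.

```python
import multiprocessing


def inc(a):
    b = 1
    return a + b


def pools(times=1000, workers=10):
    # Assumption: Linux, so the "fork" start method is available.
    ctx = multiprocessing.get_context("fork")
    # The with-block shuts the worker processes down on exit.
    with ctx.Pool(workers) as pool:
        # map() splits the inputs across the workers and preserves order.
        return pool.map(inc, range(times))


if __name__ == "__main__":
    print(pools()[:5])
```

Note that the pool starts its worker processes once and reuses them for all inputs, so this does not pay the process creation cost 1000 times.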


UPD 2. I have added a version with os.fork():

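import os

for i in xrange(times):
    child_pid = os.fork()
    if child_pid:
        # Parent: wait for the child to exit before forking again.
        os.waitpid(child_pid, 0)
    else:
        # Child: exit immediately.
        exit(-1)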

Results are:

$ time python test_fork.py
real    0m3.919s
user    0m0.040s
sys     0m0.208s

$ time python test_multiprocessing.py
real    0m1.088s
user    0m0.128s
sys     0m0.292s

$ time python test_threadings.py
real    0m0.134s
user    0m0.112s
sys     0m0.048s

Answer

The question you linked to is comparing the cost of just calling fork(2) vs. pthread_create(3), whereas your code does quite a bit more, e.g. using join() to wait for the processes/threads to terminate.
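One way to see how much of a measurement is the wait rather than the creation is to time the two calls separately. The following is a minimal Python 3 sketch for threads only; the function name measure and its return format are made up for illustration, and the numbers it yields will vary by machine.

```python
import threading
import time


def inc(a):
    b = 1
    return a + b


def measure(times=1000):
    """Return (seconds spent in start(), seconds spent in join())."""
    start_total = 0.0
    join_total = 0.0
    for i in range(times):
        t = threading.Thread(target=inc, args=(i,))
        t0 = time.perf_counter()
        t.start()   # creation: ask for a new thread
        t1 = time.perf_counter()
        t.join()    # waiting: block until the thread has finished
        t2 = time.perf_counter()
        start_total += t1 - t0
        join_total += t2 - t1
    return start_total, join_total


if __name__ == "__main__":
    started, joined = measure()
    print("start(): %.4f s, join(): %.4f s" % (started, joined))
```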

If, as you say...

The goal here is to test the time needed to create a process and a thread.

...then you shouldn't be waiting for them to complete. You should be using test programs more like these...

fork.py

import os
import time

def main():
    for i in range(100):
        pid = os.fork()
        if pid:
            #print 'created new process %d' % pid
            continue
        else:
            time.sleep(1)
            return

if __name__ == '__main__':
    main()

thread.py

import thread
import time

def dummy():
    time.sleep(1)

def main():
    for i in range(100):
        tid = thread.start_new_thread(dummy, ())
        #print 'created new thread %d' % tid

if __name__ == '__main__':
    main()

...which give the following results...

$ time python fork.py
real    0m0.035s
user    0m0.008s
sys     0m0.024s

$ time python thread.py
real    0m0.032s
user    0m0.012s
sys     0m0.024s

...so there's not much difference in the creation time of threads and processes.
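The two test programs are written for Python 2 (the thread module was renamed in Python 3). A rough Python 3 equivalent that times only the creation calls inside one script could look like the sketch below. The function names, the pause parameter, and the use of time.perf_counter() instead of the shell's time command are choices made here, not part of the original answer, and it assumes a Unix system where os.fork() is available.

```python
import os
import threading
import time

N = 100  # same count as in fork.py and thread.py


def time_fork_creation(n=N, pause=0.1):
    """Return the seconds spent in n os.fork() calls in the parent."""
    pids = []
    start = time.perf_counter()
    for _ in range(n):
        pid = os.fork()
        if pid == 0:
            # Child: idle briefly, then leave without running cleanup handlers.
            time.sleep(pause)
            os._exit(0)
        pids.append(pid)
    elapsed = time.perf_counter() - start
    # Reap the children outside the timed region so no zombies remain.
    for pid in pids:
        os.waitpid(pid, 0)
    return elapsed


def time_thread_creation(n=N, pause=0.1):
    """Return the seconds spent creating and starting n threads."""
    threads = []
    start = time.perf_counter()
    for _ in range(n):
        t = threading.Thread(target=time.sleep, args=(pause,))
        t.start()
        threads.append(t)
    elapsed = time.perf_counter() - start
    # Wait for the threads outside the timed region.
    for t in threads:
        t.join()
    return elapsed


if __name__ == "__main__":
    print("threads: %.4f s" % time_thread_creation())
    print("forks:   %.4f s" % time_fork_creation())
```

Waiting for the children and threads happens only after the clock has stopped, so the reported figures cover creation alone.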

