Fix jumping of multiple progress bars (tqdm) in python multiprocessing

2024/9/21 12:24:50

I want to parallelize a task (progresser()) over a range of input parameters (L). The progress of each task should be shown by its own progress bar in the terminal. I'm using the tqdm package for the progress bars. The following code works on my Mac for up to 23 progress bars (L = list(range(23)) and below), but the bars start jumping chaotically at L = list(range(24)). Does anyone have an idea how to fix this?

from time import sleep
import random
from tqdm import tqdm
from multiprocessing import Pool, freeze_support, RLock

L = list(range(24))  # works until 23, breaks starting at 24


def progresser(n):
    text = f'#{n}'
    sampling_counts = 10
    with tqdm(total=sampling_counts, desc=text, position=n+1) as pbar:
        for i in range(sampling_counts):
            sleep(random.uniform(0, 1))
            pbar.update(1)


if __name__ == '__main__':
    freeze_support()
    p = Pool(processes=None,
             initargs=(RLock(),), initializer=tqdm.set_lock)
    p.map(progresser, L)
    print('\n' * (len(L) + 1))

As an example of what it should look like in general, I provide a screenshot for L = list(range(16)) below.

[Screenshot: multiprocessing progress bars]

versions: python==3.7.3, tqdm==4.32.1

Answer

I'm not getting any jumping when I set the size to 30. Maybe you have more processors and can have more workers running.

However, as n grows large you will start to see jumps because of how chunksize works.

That is, p.map splits your input into chunks and hands each worker process one chunk at a time. As n grows, so does the chunksize, and so does the position you pass to tqdm (position=n+1).
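To make the chunking concrete, here is a minimal sketch of how the default chunk size comes out when chunksize is not passed to map. It mirrors the computation in CPython's multiprocessing/pool.py, which is an implementation detail and may differ between versions. The helper names default_chunksize and split_into_chunks are hypothetical, and the 8 workers are an assumption for illustration.

```python
def default_chunksize(n_items: int, n_workers: int) -> int:
    """Approximate the chunk size Pool.map picks when chunksize is None.

    Mirrors CPython's internal computation (an implementation detail):
    aim for roughly four chunks per worker, rounding the size up.
    """
    chunksize, extra = divmod(n_items, n_workers * 4)
    if extra:
        chunksize += 1
    return chunksize


def split_into_chunks(items, chunksize):
    """Split items into consecutive chunks of at most chunksize elements."""
    return [items[i:i + chunksize] for i in range(0, len(items), chunksize)]


if __name__ == "__main__":
    # Assumption for illustration: a pool with 8 worker processes.
    for n_items in (24, 30, 200):
        size = default_chunksize(n_items, n_workers=8)
        chunks = split_into_chunks(list(range(n_items)), size)
        print(f"{n_items} items -> chunksize {size}, first chunks {chunks[:2]}")
```

Under these assumptions small inputs are handed out one item at a time, and only larger inputs are grouped into multi-item chunks that a single worker processes back to back.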

Note: although map returns the results in input order, the order in which they are computed is arbitrary.
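A small sketch of that distinction. It uses multiprocessing.pool.ThreadPool instead of a process pool so that it runs the same way on every platform; that swap is an assumption made for the demo, while the ordering guarantee of map is the same. The names slow_square and run_demo are hypothetical.

```python
from multiprocessing.pool import ThreadPool
from time import sleep

finished = []  # completion order, filled in by the workers


def slow_square(n):
    # Earlier inputs sleep longer, so later inputs tend to finish first.
    sleep(0.05 * (5 - n))
    finished.append(n)
    return n * n


def run_demo():
    finished.clear()
    with ThreadPool(processes=5) as pool:
        results = pool.map(slow_square, range(5))
    return results, list(finished)


if __name__ == "__main__":
    results, order = run_demo()
    print("results (input order):", results)
    print("completion order:", order)
```

The results list always follows the input order, while the completion order depends on timing and can differ from run to run.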

As n grows large, I would suggest using the worker process's index as the position, so that progress is shown per process rather than per task.

from time import sleep
import random
from tqdm import tqdm
from multiprocessing import Pool, freeze_support, RLock
from multiprocessing import current_process


def progresser(n):
    text = f'#{n}'
    sampling_counts = 10
    current = current_process()
    pos = current._identity[0] - 1
    with tqdm(total=sampling_counts, desc=text, position=pos) as pbar:
        for i in range(sampling_counts):
            sleep(random.uniform(0, 1))
            pbar.update(1)


if __name__ == '__main__':
    freeze_support()
    L = list(range(30))  # works until 23, breaks starting at 24
    # p = Pool(processes=None,
    #          initargs=(RLock(),), initializer=tqdm.set_lock
    #          )
    with Pool(initializer=tqdm.set_lock, initargs=(tqdm.get_lock(),)) as p:
        p.map(progresser, L)
    print('\n' * (len(L) + 1))
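One caveat with the position lookup: _identity is a private CPython attribute, and it is an empty tuple in the main process, so indexing it there raises an IndexError. A slightly more defensive version might look like the sketch below. The helper names position_from_identity and current_bar_position are hypothetical, and falling back to row 0 is an assumption, not something tqdm requires.

```python
from multiprocessing import current_process


def position_from_identity(identity, default=0):
    """Map a process identity tuple to a zero-based progress bar row.

    Worker processes carry a tuple such as (3,); the main process
    carries an empty tuple, for which the default row is returned.
    """
    if not identity:
        return default
    return identity[0] - 1


def current_bar_position():
    # _identity is private to CPython's multiprocessing and may change,
    # so read it with getattr and treat a missing value as "main process".
    identity = getattr(current_process(), "_identity", ())
    return position_from_identity(identity)


if __name__ == "__main__":
    print("bar position in this process:", current_bar_position())
```

Inside progresser, position=current_bar_position() would then replace the direct index into _identity. The identity counter keeps increasing across all child processes the parent starts, so positions are only contiguous for the first pool created, as long as its workers are not replaced.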