I want to use parallelism to update a global variable using the concurrent.futures module in Python.
It turned out that ThreadPoolExecutor can update my global variable, but the CPU never uses its full potential (always at 5-10%), which is very slow,
while ProcessPoolExecutor uses all the cores (at 100%), but my global variable cannot be updated because the processes do not share the same global variables.
How can I share my global variable using ProcessPoolExecutor in concurrent.futures? Thank you a lot for your help.
Processes, unlike threads, do not share the same memory space, so you need a special mechanism to update variables. (This also explains your CPU usage: for CPU-bound work, threads are serialized by the GIL, which is why ThreadPoolExecutor stayed at 5-10%.) ProcessPoolExecutor is built on the multiprocessing module, which offers two ways of sharing data: shared memory and a server process. The shared memory approach uses a shared memory map; the server process approach uses a Manager object that acts as a proxy holding the shared data. A server process is more flexible; shared memory is more efficient.
Sharing data via a server process works much like with ThreadPoolExecutor: just pass the Manager proxy as an argument to your function.
```python
import multiprocessing
from concurrent.futures import ProcessPoolExecutor

def running_proxy(mval):
    # consider a lock if you need one
    return mval.value

def start_executor():
    with multiprocessing.Manager() as manager:
        executor = ProcessPoolExecutor(max_workers=5)
        mval = manager.Value('b', 1)
        futures = [executor.submit(running_proxy, mval) for _ in range(5)]
        results = [x.result() for x in futures]
        executor.shutdown()
```
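Since the question is about updating the global value, not just reading it, here is a minimal sketch of workers incrementing a Manager proxy under a Manager lock (the names `add_one` and `run` are illustrative, not from the original code):

```python
import multiprocessing
from concurrent.futures import ProcessPoolExecutor

def add_one(mval, lock):
    # hold the lock so the read-modify-write of the proxy is atomic
    with lock:
        mval.value += 1

def run():
    with multiprocessing.Manager() as manager:
        mval = manager.Value('i', 0)
        lock = manager.Lock()
        with ProcessPoolExecutor(max_workers=5) as executor:
            futures = [executor.submit(add_one, mval, lock) for _ in range(20)]
            for f in futures:
                f.result()  # re-raises any exception from a worker
        return mval.value

if __name__ == '__main__':
    print(run())  # all 20 increments are visible across processes
```

Manager proxies are picklable, so they can be passed directly in `executor.submit(...)`, which is why this variant needs no global at all.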
The shared memory way differs a little: you need to make the shared variable available as a global in each worker process.
```python
import multiprocessing
from concurrent.futures import ProcessPoolExecutor

def running_shared():
    # consider a lock if you need one
    return sval.value

def set_global(args):
    global sval
    sval = args

def start_executor():
    sval = multiprocessing.Value('b', 1)
    # for 3.7+
    executor = ProcessPoolExecutor(max_workers=5,
                                   initializer=set_global, initargs=(sval,))
    # for ~3.6
    # set_global(sval)
    # executor = ProcessPoolExecutor(max_workers=5)
    futures = [executor.submit(running_shared) for _ in range(5)]
    results = [x.result() for x in futures]
    executor.shutdown()
```
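The same pattern also supports writes: a `multiprocessing.Value` carries its own lock (`get_lock()`), and since the initializer's arguments are sent to the workers while they are being spawned, passing the Value through `initargs` is allowed even though submitting it via `executor.submit` is not. A minimal sketch (the names `init_worker`, `increment` and `run` are illustrative):

```python
import multiprocessing
from concurrent.futures import ProcessPoolExecutor

def init_worker(shared):
    # runs once in every worker process; stores the Value as a global
    global sval
    sval = shared

def increment():
    # the Value's built-in lock guards the read-modify-write
    with sval.get_lock():
        sval.value += 1

def run():
    shared = multiprocessing.Value('i', 0)
    with ProcessPoolExecutor(max_workers=5, initializer=init_worker,
                             initargs=(shared,)) as executor:
        futures = [executor.submit(increment) for _ in range(20)]
        for f in futures:
            f.result()  # re-raises any exception from a worker
    return shared.value

if __name__ == '__main__':
    print(run())  # all 20 increments land in the shared memory value
```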