Comparison of multi-threading models in Julia >= 1.3 and Python 3.x

2024/9/16 23:25:39

I would like to understand, from the user's point of view, the differences between the multithreading programming models of Julia >= 1.3 and Python 3.

Is one of them more efficient than the other (in the sense that raising the number of threads reduces the computation time more)? In which situations (e.g. one model may have an edge, but only on compute-intensive or memory-intensive tasks)?

Is one of them more practical, or does one provide higher-level functions than the other?

Is one of them more flexible than the other (e.g. can it be applied to a wider set of cases)?

Answer

There are several differences between the languages, with Julia providing many levels of functionality beyond what you can find in Python. You have the following types of parallelism (I am discussing here the standard language features, not functionality available via external libraries):

  1. SIMD (single-instruction-multiple-data) feature of CPUs
  • Julia: combine @simd with @inbounds (see https://docs.julialang.org/en/v1/manual/performance-tips/)
  • Python: not supported
  2. Green threads (also called coroutines). This is not actual threading, but it allows one system thread to be shared across many tasks. It is particularly useful for overlapping IO operations such as web scraping or inter-process communication: for example, if one task is waiting for IO, other tasks can run in the meantime.
  • Julia: use a combination of the @sync (to collect a group of tasks) and @async (to spawn new tasks) macros (for more details see https://docs.julialang.org/en/v1/manual/parallel-computing/)
  • Python: use asyncio (for more details see https://docs.python.org/3/library/asyncio-task.html)
  3. Multithreading: run several tasks in parallel within a single process (with shared memory) across several system threads
  • Julia: use the Threads.@threads macro to parallelize loops and Threads.@spawn to launch tasks on separate system threads. Use locks or atomic values to control the parallel execution (for more details see https://docs.julialang.org/en/v1/manual/parallel-computing/)
  • Python: the threading module exists, but it is not useful for CPU-bound tasks because of the GIL (global interpreter lock)
  4. Multi-processing
  • Julia: use macros from the Distributed package to parallelize loops and spawn remote processes (for more details see https://docs.julialang.org/en/v1/manual/parallel-computing/)
  • Python: use the multiprocessing library (for more details see https://docs.python.org/3.8/library/multiprocessing.html)
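
To make the Python side of the green-thread and multithreading items above a bit more concrete, here is a minimal sketch using only the standard library. The function names (fake_io, gather_io, count_down, run_in_threads) are hypothetical and chosen for illustration, and the use of concurrent.futures.ThreadPoolExecutor is my assumption; the answer above only names asyncio, the GIL, and multiprocessing. On a standard CPython build with the GIL, I would expect the asyncio part to finish in roughly the time of one sleep, while the thread pool should give little or no speedup for the pure-Python loop.

```python
import asyncio
import time
from concurrent.futures import ThreadPoolExecutor


async def fake_io(name, delay):
    """Stand-in for an IO-bound call (hypothetical; a real task might fetch a URL)."""
    await asyncio.sleep(delay)  # yields control so other coroutines can run
    return name


async def gather_io(n, delay):
    """Run n coroutines concurrently on a single system thread."""
    return await asyncio.gather(*(fake_io(f"task-{i}", delay) for i in range(n)))


def count_down(n):
    """Pure-Python CPU-bound loop; it holds the GIL while it runs."""
    while n > 0:
        n -= 1
    return n


def run_in_threads(workers, n):
    """Run the CPU-bound loop once per worker in a thread pool."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(count_down, [n] * workers))


if __name__ == "__main__":
    # IO-bound: five waits of 0.2 s each overlap on one thread.
    start = time.perf_counter()
    names = asyncio.run(gather_io(5, 0.2))
    print(names, f"{time.perf_counter() - start:.2f}s")

    # CPU-bound: the threads take turns holding the GIL.
    start = time.perf_counter()
    run_in_threads(4, 2_000_000)
    print(f"4 threads: {time.perf_counter() - start:.2f}s")
```

Timings will vary by machine, so none are shown here. For CPU-bound work, concurrent.futures.ProcessPoolExecutor offers the same map interface on top of separate processes, which corresponds to the multi-processing route listed above.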
