Running multiple threads in Python simultaneously: is it possible?

2024/11/16 23:00:13

I'm writing a little crawler that should fetch a URL multiple times, and I want all of the threads to run at the same time (simultaneously).

I've written a little piece of code that should do that.

    import thread
    import time  # needed for time.sleep below
    from urllib2 import Request, urlopen, URLError, HTTPError

    def getPAGE(FetchAddress):
        attempts = 0
        while attempts < 2:
            req = Request(FetchAddress, None)
            try:
                response = urlopen(req, timeout = 8) #fetching the url
                print "fetched url %s" % FetchAddress
            except HTTPError, e:
                print 'The server didn\'t do the request.'
                print 'Error code: ', str(e.code) + "  address: " + FetchAddress
                time.sleep(4)
                attempts += 1
            except URLError, e:
                print 'Failed to reach the server.'
                print 'Reason: ', str(e.reason) + "  address: " + FetchAddress
                time.sleep(4)
                attempts += 1
            except Exception, e:
                # a generic exception may not have a .reason attribute
                print 'Something bad happened in getPAGE.'
                print 'Reason: ', str(e) + "  address: " + FetchAddress
                time.sleep(4)
                attempts += 1
            else:
                try:
                    return response.read()
                except:
                    print "there was an error with response.read()"
                    return None
        return None

    url = ("http://www.domain.com",)
    for i in range(1,50):
        thread.start_new_thread(getPAGE, url)

From the Apache logs it doesn't seem like the threads are running simultaneously. There's a little gap between requests; it's almost undetectable, but I can see that the threads are not really parallel.

I've read about the GIL. Is there a way to bypass it without calling C/C++ code? I can't really understand how threading is possible with the GIL. Does Python basically interpret the next thread as soon as it finishes with the previous one?

Thanks.

Answer

As you point out, the GIL often prevents Python threads from running in parallel.

However, that's not always the case. One exception is I/O-bound code. When a thread is waiting for an I/O request to complete, it would typically have released the GIL before entering the wait. This means that other threads can make progress in the meantime.
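To illustrate the I/O-bound case, here is a minimal sketch in Python 3 (the question's code is Python 2). It does not hit a real server: `time.sleep` stands in for the blocking network wait, on the assumption that it behaves like socket I/O in releasing the GIL while it waits. The names `fake_fetch` and `run_threads` are made up for this example.

```python
import threading
import time

def fake_fetch(delay, results, index):
    # time.sleep is a stand-in for a blocking network call; while a thread
    # sleeps it does not hold the GIL, so other threads can run.
    time.sleep(delay)
    results[index] = "done"

def run_threads(count=10, delay=0.2):
    results = [None] * count
    threads = [
        threading.Thread(target=fake_fetch, args=(delay, results, i))
        for i in range(count)
    ]
    start = time.perf_counter()
    for t in threads:
        t.start()
    for t in threads:
        t.join()  # wait for every thread before measuring
    elapsed = time.perf_counter() - start
    return results, elapsed

if __name__ == "__main__":
    results, elapsed = run_threads()
    print("all finished:", all(r == "done" for r in results))
    print("elapsed: %.2fs" % elapsed)
```

If the waits overlap as expected, the elapsed time should be close to a single `delay` rather than `count * delay`. The threads are still started one after another, so a small gap between the start of each request is normal.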

In general, however, multiprocessing is the safer bet when true parallelism is required.
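For CPU-bound work, a rough sketch of the multiprocessing route might look like the following (Python 3). The prime-counting task and the names `count_primes` and `parallel_count` are hypothetical, chosen only to give each process something to compute; each worker process has its own interpreter and its own GIL.

```python
import multiprocessing

def count_primes(limit):
    # Deliberately naive CPU-bound work: count the primes below limit.
    count = 0
    for n in range(2, limit):
        if all(n % d for d in range(2, int(n ** 0.5) + 1)):
            count += 1
    return count

def parallel_count(limits, workers=2):
    # Each worker is a separate process, so the calls can run on
    # different cores at the same time.
    with multiprocessing.Pool(processes=workers) as pool:
        return pool.map(count_primes, limits)

if __name__ == "__main__":
    print(parallel_count([10, 100, 1000]))
```

The `if __name__ == "__main__":` guard matters on platforms that start worker processes by re-importing the main module.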
