    import threading
    import Queue
    import urllib2
    import time

    class ThreadURL(threading.Thread):

        def __init__(self, queue):
            threading.Thread.__init__(self)
            self.queue = queue

        def run(self):
            while True:
                host = self.queue.get()
                sock = urllib2.urlopen(host)
                data = sock.read()
                self.queue.task_done()

    hosts = ['http://www.google.com', 'http://www.yahoo.com', 'http://www.facebook.com', 'http://stackoverflow.com']
    start = time.time()

    def main():
        queue = Queue.Queue()

        for i in range(len(hosts)):
            t = ThreadURL(queue)
            t.start()

        for host in hosts:
            queue.put(host)

        queue.join()

    if __name__ == '__main__':
        main()
        print 'Elapsed time: {0}'.format(time.time() - start)

I've been trying to get my head around threading, and after a few tutorials I've come up with the above.
What it's supposed to do is:
- Initialise the queue
- Create my Thread pool and then queue up the list of hosts
- My ThreadURL class should then begin work once a host is in the queue and read the website data
- The program should finish
What I want to know first off is, am I doing this correctly? Is this the best way to handle threads?
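For comparison, one alternative to a hand-written `Thread` subclass plus queue is the standard library's `concurrent.futures.ThreadPoolExecutor`. The sketch below is only illustrative and makes several assumptions that are not in the code above: it targets Python 3 rather than Python 2, and `fetch_all` and `fake_fetch` are hypothetical names, with `fake_fetch` standing in for the real `urlopen(host).read()` call so the example runs without network access.

```python
from concurrent.futures import ThreadPoolExecutor

hosts = ['http://www.google.com', 'http://www.yahoo.com',
         'http://www.facebook.com', 'http://stackoverflow.com']

def fake_fetch(url):
    # Hypothetical stand-in for urlopen(url).read(); no network access.
    return '<html>%s</html>' % url

def fetch_all(urls, fetch, max_workers=4):
    """Call fetch(url) for every URL on a thread pool; return {url: result}."""
    # Leaving the with-block waits for all submitted work and shuts the
    # worker threads down, so no thread is left running at exit.
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        results = list(pool.map(fetch, urls))  # results keep input order
    return dict(zip(urls, results))

pages = fetch_all(hosts, fake_fetch)
```

Whether this counts as "the best way" depends on the workload; it mainly removes the manual bookkeeping of starting workers and signalling completion.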
Secondly, my program fails to exit. It prints out the Elapsed time line and then hangs there. I have to kill my terminal for it to go away. I'm assuming this is due to my incorrect use of queue.join()?
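A likely explanation is that `queue.join()` itself returns as expected (which is why the elapsed time is printed), but each worker sits in `while True`, blocked in `queue.get()`, and is not a daemon thread, so the interpreter waits for the workers forever. One way around this is to send each worker a sentinel value that tells it to return. The sketch below assumes Python 3 (`queue` rather than `Queue`); `worker`, `run` and the `upper()` call standing in for the download are hypothetical and not part of the original code.

```python
import queue
import threading

def worker(q, results):
    while True:
        host = q.get()
        try:
            if host is None:              # sentinel: stop this worker
                return
            results.append(host.upper())  # stand-in for the real download
        finally:
            q.task_done()                 # also runs for the sentinel

def run(hosts, num_workers=4):
    q = queue.Queue()
    results = []
    threads = [threading.Thread(target=worker, args=(q, results))
               for _ in range(num_workers)]
    for t in threads:
        t.start()
    for host in hosts:
        q.put(host)
    for _ in threads:
        q.put(None)                       # one sentinel per worker
    q.join()                              # every item and sentinel processed
    for t in threads:
        t.join()                          # workers have actually returned
    return results, threads

results, threads = run(['http://www.google.com', 'http://stackoverflow.com'])
```

A shorter alternative is to mark each worker as a daemon before starting it (`t.daemon = True`), which lets the interpreter exit while the workers are still blocked, at the cost of them being stopped abruptly.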