I built a scraper (worker) that I launch XX times using multithreading (Jupyter Notebook, Python 2.7, Anaconda). The script follows the queue example from python.org:
```python
def worker():
    while True:
        item = q.get()
        do_work(item)
        q.task_done()

q = Queue()
for i in range(num_worker_threads):
    t = Thread(target=worker)
    t.daemon = True
    t.start()

for item in source():
    q.put(item)

q.join()  # block until all tasks are done
```
When I run the script as is, there are no issues: memory is released after the script finishes.
However, I want to run the script 20 times (a sort of batching), so I wrapped it in a function and call that function in a loop:
```python
def multithreaded_script():
    # my script (code from above)
    ...

x = 0
while x < 20:
    x += 1
    multithreaded_script()
```
Memory builds up with each iteration, and eventually the system starts swapping to disk.
Is there a way to clear out the memory after each run?
I tried:
- setting all the variables to None
- calling `sleep(30)` at the end of each iteration (in case the RAM takes time to be released)
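For clarity, here is roughly what that cleanup attempt looks like (the scraper body is a placeholder, and I shortened the sleep for illustration):

```python
import gc
import time

def multithreaded_script():
    # placeholder for the actual scraper; just allocates something
    data = [i for i in range(100000)]
    return len(data)

x = 0
while x < 20:
    x += 1
    multithreaded_script()
    gc.collect()       # force a full garbage-collection pass
    time.sleep(0.1)    # I used sleep(30) in the real run
```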
and nothing seems to help. Any ideas on what else I can try to clear the memory after each run within the while loop? If not, is there a better way to execute my script XX times that would not eat up the RAM?
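One idea I have not fully tested: run each batch in its own process with `multiprocessing`, since a process's memory is returned to the OS when it exits. A minimal sketch (the scraper body is again a placeholder):

```python
import multiprocessing

def multithreaded_script():
    # placeholder for the scraper; the real version spawns
    # the worker threads and joins the queue
    data = [i * i for i in range(100000)]
    return len(data)

def run_batch():
    multithreaded_script()

if __name__ == "__main__":
    for x in range(20):
        # each batch gets a fresh process; all of its memory is
        # released back to the OS when the process exits
        p = multiprocessing.Process(target=run_batch)
        p.start()
        p.join()
```

Would this be the recommended approach, or is there a lighter-weight way?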
Thank you in advance.