python - dictionary iterator for pool map


I am handling a set of frozensets. I am trying to find minimal sets for each frozenset in the dictionary 'output'. I have 70k frozensets, so I am splitting this frozenset dictionary into chunks and parallelizing the task. When I try to pass this dictionary as input to my function, only the key is being sent, so I am getting an error. Can someone help me find what's wrong here?

output = {frozenset({'rfid', 'zone'}): 0, frozenset({'zone'}): 0, frozenset({'zone', 'time'}): 0}

def reduce(prob, result, output):
    print(output)
    for k in output.keys():
        ...  # Function to do something

def reducer(prob, result, output):
    print(output)
    p = Pool(4)  # number of processes = number of CPUs
    func2 = partial(reduce, prob, result)
    reduced_values = p.map(func2, output, chunksize=4)
    p.close()  # no more tasks
    p.join()   # wrap up current tasks
    return reduced_values

if __name__ == '__main__':
    final = reducer(prob, result, output)

Console output:

{frozenset({'rfid', 'zone'}): 0, frozenset({'zone'}): 0, frozenset({'zone', 'time'}): 0}
frozenset({'rfid', 'zone'}) 
Error : AttributeError: 'frozenset' object has no attribute 'keys'

Updated as requested

from multiprocessing import Pool
from functools import partial
import itertools

output = {frozenset({'rfid', 'zone'}): 0, frozenset({'zone'}): 0, frozenset({'zone', 'time'}): 0}
prob = {'3': 0.3, '1': 0.15, '2': 0.5, '4': 0.05}
result = {'2': {frozenset({'time', 'zone'}), frozenset({'time', 'rfid'})}, '3': {frozenset({'time', 'rfid'}), frozenset({'rfid', 'zone'})}}

def reduce(prob, result, output):
    print(output)
    for k in output.keys():
        for ky, values in result.items():
            if any(k >= l for l in values):
                output[k] += sum(j for i, j in prob.items() if i == ky)
    return output

def reducer(prob, result, output):
    print(output)
    p = Pool(4)  # number of processes = number of CPUs
    func2 = partial(reduce, prob, result)
    reduced_values = p.map(func2, output, chunksize=4)
    p.close()  # no more tasks
    p.join()   # wrap up current tasks
    return reduced_values

if __name__ == '__main__':
    final = reducer(prob, result, output)

Console output:

{frozenset({'zone', 'rfid'}): 0, frozenset({'zone'}): 0, frozenset({'time', 'zone'}): 0}
    for k in output.keys():
AttributeError: 'frozenset' object has no attribute 'keys'
frozenset({'zone', 'rfid'})

Full error details from the console:

{frozenset({'zone', 'time'}): 0, frozenset({'zone', 'rfid'}): 0, frozenset({'zone'}): 0}
frozenset({'zone', 'time'})
multiprocessing.pool.RemoteTraceback: 
"""
Traceback (most recent call last):
  File "F:\Python34\lib\multiprocessing\pool.py", line 119, in worker
    result = (True, func(*args, **kwds))
  File "F:\Python34\lib\multiprocessing\pool.py", line 44, in mapstar
    return list(map(*args))
  File "C:\Users\Dell\workspace\key_mining\src\variable.py", line 16, in reduce
    for k in output.keys():
AttributeError: 'frozenset' object has no attribute 'keys'
"""

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "C:\***\variable.py", line 33, in <module>
    final = reducer(prob,result,output)
  File "C:\***\variable.py", line 27, in reducer
    reduced_values= p.map( func2,output,chunksize=4)
  File "F:\Python34\lib\multiprocessing\pool.py", line 260, in map
    return self._map_async(func, iterable, mapstar, chunksize).get()
  File "F:\Python34\lib\multiprocessing\pool.py", line 599, in get
    raise self._value
AttributeError: 'frozenset' object has no attribute 'keys'
Answer

The problem is that you're passing a dict object to map. When map iterates over the items in output, it's doing this:

for key in output:  # Iterating over a dictionary yields only its keys.
    func2(key)

So each time func2 is called, all that's contained in output is a single key (a frozenset) from the dictionary.
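You can see this with a tiny standalone demo (hypothetical names, not your code): each worker call receives one dictionary key, and a frozenset key has no .keys() method.

```python
from multiprocessing import Pool

def describe(arg):
    # Each call receives a single element of the iterable -- here, one dict key.
    return type(arg).__name__

if __name__ == '__main__':
    d = {frozenset({'a'}): 0, frozenset({'b'}): 0}
    with Pool(2) as p:
        print(p.map(describe, d))  # ['frozenset', 'frozenset']
```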

Based on your comments above, it seems you want to pass the entire dictionary to func2, but if you do that, nothing actually runs in parallel. You may be expecting that doing

pool.map(func2, output, chunksize=4)

will result in the output dictionary being split into four dictionaries, with each chunk passed to an instance of func2. But that's not what happens at all. Instead, each key from the dictionary is sent individually to func2.

chunksize is just used to tell the pool how many elements of output to send to each child process via inter-process communication at a time. It's only used for internal purposes; no matter what chunksize you use, func2 will only be called with a single element of output.
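A quick way to convince yourself of that (a small sketch using a builtin function for simplicity): the result is identical for any chunksize, because the function still runs once per element either way.

```python
from multiprocessing import Pool

if __name__ == '__main__':
    data = ['a', 'bb', 'ccc', 'dddd', 'eeeee']
    with Pool(2) as p:
        # One call per element regardless of chunksize; chunksize only controls
        # how many elements are shipped to a worker process in one batch.
        for cs in (1, 2, 5):
            assert p.map(len, data, chunksize=cs) == [1, 2, 3, 4, 5]
    print('identical results for every chunksize')
```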

If you want to actually pass chunks of the dict, you need to do something like this:

# Break the output dict into lists of (key, value) pairs, 4 pairs per chunk
items = list(output.items())
chunksize = 4
chunks = [items[i:i + chunksize] for i in range(0, len(items), chunksize)]
reduced_values = p.map(func2, chunks)

That will pass a list of (key, value) tuples from the output dict to func2. Then, inside func2, you can turn the list back into a dict:

def reduce(prob, result, output):
    output = dict(item for item in output)  # Convert back to a dict
    print(output)
    ...
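Putting it all together with the data from your question, here is a runnable sketch. I've renamed the worker to reduce_chunk so it no longer shadows the builtin reduce, and I merge the per-chunk dicts at the end -- both are my choices, not something your code requires.

```python
from multiprocessing import Pool
from functools import partial

output = {frozenset({'rfid', 'zone'}): 0, frozenset({'zone'}): 0, frozenset({'zone', 'time'}): 0}
prob = {'3': 0.3, '1': 0.15, '2': 0.5, '4': 0.05}
result = {'2': {frozenset({'time', 'zone'}), frozenset({'time', 'rfid'})},
          '3': {frozenset({'time', 'rfid'}), frozenset({'rfid', 'zone'})}}

def reduce_chunk(prob, result, chunk):
    # chunk is a list of (key, value) pairs; turn it back into a dict.
    out = dict(chunk)
    for k in out:
        for ky, values in result.items():
            # k >= l means "k is a superset of l" for frozensets.
            if any(k >= l for l in values):
                out[k] += sum(j for i, j in prob.items() if i == ky)
    return out

def reducer(prob, result, output, nprocs=4, chunksize=4):
    items = list(output.items())
    chunks = [items[i:i + chunksize] for i in range(0, len(items), chunksize)]
    func2 = partial(reduce_chunk, prob, result)
    with Pool(nprocs) as p:
        reduced_values = p.map(func2, chunks)
    # Merge the per-chunk dicts back into one result dict.
    merged = {}
    for d in reduced_values:
        merged.update(d)
    return merged

if __name__ == '__main__':
    final = reducer(prob, result, output)
    print(final)
```

With this data, frozenset({'rfid', 'zone'}) picks up 0.3 (from key '3'), frozenset({'zone', 'time'}) picks up 0.5 (from key '2'), and frozenset({'zone'}) stays 0.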
