Understanding an issue with the namedtuple typename and pickle in Python

2024/9/8 10:50:51

Earlier today I was having trouble trying to pickle a namedtuple instance. As a sanity check, I tried running some code that was posted in another answer. Here it is, simplified a little more:

from collections import namedtuple
import pickle

P = namedtuple("P", "one two three four")

def pickle_test():
    abe = P("abraham", "lincoln", "vampire", "hunter")
    f = open('abe.pickle', 'w')
    pickle.dump(abe, f)
    f.close()

pickle_test()

I then changed two lines of this to use my named tuple:

from collections import namedtuple
import pickle

P = namedtuple("my_typename", "A B C")

def pickle_test():
    abe = P("ONE", "TWO", "THREE")
    f = open('abe.pickle', 'w')
    pickle.dump(abe, f)
    f.close()

pickle_test()

However this gave me the error

  File "/path/to/anaconda/lib/python2.7/pickle.py", line 748, in save_global
    (obj, module, name))
pickle.PicklingError: Can't pickle <class '__main__.my_typename'>: it's not found as __main__.my_typename

In other words, the pickle module is looking for my_typename. When I changed the line P = namedtuple("my_typename", "A B C") to P = namedtuple("P", "A B C"), it worked.

I looked at the source of namedtuple.py and at the end we have something that looks relevant, but I don't fully understand what is happening:

# For pickling to work, the __module__ variable needs to be set to the frame
# where the named tuple is created.  Bypass this step in enviroments where
# sys._getframe is not defined (Jython for example) or sys._getframe is not
# defined for arguments greater than 0 (IronPython).
try:
    result.__module__ = _sys._getframe(1).f_globals.get('__name__', '__main__')
except (AttributeError, ValueError):
    pass

return result

So my question is: what exactly is going on? Why does the typename argument need to match the name of the variable that the factory's result is assigned to for this to work?

Answer

The section titled "What can be pickled and unpickled?" of the Python documentation states that only "classes that are defined at the top level of a module" can be pickled. namedtuple() is a factory function that effectively defines a class (my_typename(tuple) in your second example), but your code does not assign the manufactured type to a variable named my_typename at the top level of the module.

This matters because pickle saves only the "fully qualified" name of such things, not their code, and they must be importable under that name from the module they are in to be unpickled later (hence the requirement that the module contain the named object at the top level).
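
The lookup described above can be approximated in a few lines. This is only a rough sketch of the idea, not pickle's actual implementation: the module name demo_records and the helper found_under_own_name are made up for the illustration, and the module= keyword of namedtuple() assumes Python 3.6 or later. A throwaway module is registered by hand so that the result does not depend on how the script is launched.

```python
import pickle
import sys
import types
from collections import namedtuple

# Hypothetical stand-in module, registered by hand so pickle can "import" it.
demo = types.ModuleType("demo_records")
sys.modules["demo_records"] = demo

# Stored under the same name as its typename: pickle can find it again.
demo.P = namedtuple("P", "A B C", module="demo_records")

# The typename is "my_typename", but the module only knows the class as "Q".
demo.Q = namedtuple("my_typename", "A B C", module="demo_records")


def found_under_own_name(cls):
    """Approximate pickle's check: look the class up by module and name."""
    module = sys.modules[cls.__module__]
    return getattr(module, cls.__qualname__, None) is cls


print(found_under_own_name(demo.P))  # True
print(found_under_own_name(demo.Q))  # False

# The pickle refers to demo_records.P by name; it does not contain the class.
data = pickle.dumps(demo.P("ONE", "TWO", "THREE"))
print(b"demo_records" in data)  # True

try:
    pickle.dumps(demo.Q("ONE", "TWO", "THREE"))
    error = None
except pickle.PicklingError as exc:
    error = exc  # my_typename cannot be found in demo_records
```

The second class fails for the same reason as the example in the question: its name says my_typename, but nothing with that name exists in the module it claims to come from.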

One workaround illustrates this: change one line of the code so that a type named my_typename is defined at the top level:

P = my_typename = namedtuple("my_typename", "A B C")

Alternatively, you could just give the namedtuple the name "P" instead of "my_typename":

P = namedtuple("P", "A B C")
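
Both workarounds can be checked with a pickle round trip. The sketch below is an illustration under assumptions, not the original poster's code: the module name demo_workarounds, the variable R, and the helper round_trip are hypothetical, and the module is registered by hand (with the module= keyword, Python 3.6 or later) so the example behaves the same however it is run. In an ordinary script the two assignments would simply sit at the top level of the file.

```python
import pickle
import sys
import types
from collections import namedtuple

# Hypothetical module, registered by hand; in a real script this would be
# the file in which the assignments below appear.
mod = types.ModuleType("demo_workarounds")
sys.modules["demo_workarounds"] = mod

# Workaround 1: keep the typename and also bind the class under that name.
mod.P = mod.my_typename = namedtuple(
    "my_typename", "A B C", module="demo_workarounds"
)

# Workaround 2: make the typename match the variable name.
mod.R = namedtuple("R", "A B C", module="demo_workarounds")


def round_trip(obj):
    """Pickle obj to bytes and load it straight back."""
    return pickle.loads(pickle.dumps(obj))


first = round_trip(mod.P("ONE", "TWO", "THREE"))
second = round_trip(mod.R("ONE", "TWO", "THREE"))

print(first)   # my_typename(A='ONE', B='TWO', C='THREE')
print(second)  # R(A='ONE', B='TWO', C='THREE')
```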

As for what the namedtuple.py source code you were looking at does: it tries to determine the name of the module the caller (the creator of the namedtuple) is in and stores it in the class's __module__ attribute. The author knows that pickle may use that name to import the definition when unpickling, and that people commonly assign the result to a variable with the same name they passed to the factory function (which you didn't do in the second example).
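
The two halves of the name that pickle relies on can be inspected directly. A minimal sketch, assuming it is run as an ordinary script or imported module (the variable caller_module is introduced here only for the comparison):

```python
from collections import namedtuple

# The variable is called P, but the class itself is named by the typename.
P = namedtuple("my_typename", "A B C")

print(P.__name__)  # my_typename

# __module__ is taken from the globals of the code that called namedtuple(),
# falling back to '__main__' when no module name is available.
caller_module = globals().get("__name__", "__main__")
same_module = P.__module__ == caller_module
print(same_module)
```

Together, __module__ and __name__ form the reference that pickle records, which is why the variable name P plays no part in it.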
