How to implement autovivification for nested dictionary ONLY when assigning values?

2024/9/28 9:33:41

TL;DR
How can I get superkeys to be autovivified in a Python dict when assigning values to subkeys, without also getting them autovivified when checking for subkeys?

Background: Normally in Python, setting values in a nested dictionary requires manually ensuring that higher-level keys exist before assigning to their sub-keys. That is,

my_dict[1][2] = 3

raises a KeyError when my_dict has no key 1, unless you first do something like

if 1 not in my_dict:
    my_dict[1] = {}

Now, it is possible to set up a kind of autovivification by making my_dict an instance of a class that overrides __missing__, as shown e.g. in https://stackoverflow.com/a/19829714/6670909.
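
For reference, here is a minimal sketch of what such a class typically looks like. It assumes the usual __missing__ approach and is not a verbatim copy of the linked answer; the name Vividict simply matches the session shown further down.

```python
class Vividict(dict):
    """Minimal autovivifying dict (a sketch, assuming the __missing__ approach)."""

    def __missing__(self, key):
        # dict.__getitem__ calls this when key is absent:
        # create a child of the same type, store it, and return it.
        value = self[key] = type(self)()
        return value


vd = Vividict()
vd[1][2] = 3     # vd[1] is created on the fly by __missing__
print(vd)        # {1: {2: 3}}
```

Because __missing__ fires on every failed lookup, it cannot tell a read from the first half of a nested assignment.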

Question: However, that solution silently autovivifies higher-level keys if you check for the existence of a sub-key in such a nested dict. That leads to the following unfortunateness:

>>> vd = Vividict()
>>> 1 in vd
False
>>> 2 in vd[1]
False
>>> 1 in vd
True

How can I avoid that misleading result? In Perl, by the way, I can get the desired behavior by doing

no autovivification qw/exists/;

And basically I'd like to replicate that behavior in Python if possible.

Answer

This is not an easy problem to solve, because in your example:

my_dict[1][2] = 3

my_dict[1] results in a __getitem__ call on the dictionary. There is no way at that point to know that an assignment is being made. Only the last [] in the sequence is a __setitem__ call, and it can't succeed unless my_dict[1] exists, because otherwise, what object are you assigning into?
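
To see this for yourself, here is a small sketch that records which special methods get called. TracingDict is a hypothetical helper made up for this illustration, not part of any library.

```python
class TracingDict(dict):
    """Hypothetical dict subclass that records calls to __getitem__/__setitem__."""

    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)
        self.calls = []

    def __getitem__(self, key):
        self.calls.append(("__getitem__", key))
        return super().__getitem__(key)

    def __setitem__(self, key, value):
        self.calls.append(("__setitem__", key))
        super().__setitem__(key, value)


inner = TracingDict()
outer = TracingDict({1: inner})   # pre-populate so the lookup succeeds

outer[1][2] = 3
print(outer.calls)   # [('__getitem__', 1)]
print(inner.calls)   # [('__setitem__', 2)]
```

The outer dict only ever sees a plain lookup; the assignment is visible to the inner object alone.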

So don't use autovivification. You can use setdefault() instead, with a regular dict.

my_dict.setdefault(1, {})[2] = 3
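
As a quick sketch of how this behaves with deeper nesting (the keys "a", "b", "c" and 5 are arbitrary examples), note that membership checks on a plain dict leave it untouched:

```python
my_dict = {}

# Each setdefault returns the existing child, or inserts and returns a new one.
my_dict.setdefault(1, {})[2] = 3
my_dict.setdefault("a", {}).setdefault("b", {})["c"] = 4

# Membership tests and .get() never create keys on a plain dict.
print(5 in my_dict)               # False
print(2 in my_dict.get(5, {}))    # False
print(5 in my_dict)               # False
print(my_dict)                    # {1: {2: 3}, 'a': {'b': {'c': 4}}}
```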

Now that's not exactly pretty, especially when you are nesting more deeply, so you might write a helper method:

class MyDict(dict):
    def nest(self, keys, value):
        for key in keys[:-1]:
            self = self.setdefault(key, {})
        self[keys[-1]] = value

my_dict = MyDict()
my_dict.nest((1, 2), 3)       # my_dict[1][2] = 3

But even better is to wrap this into a new __setitem__ that takes all the indexes at once, instead of requiring the intermediate __getitem__ calls that induce the autovivification. This way, we know from the beginning that we're doing an assignment and can proceed without relying on autovivification.

class MyDict(dict):
    def __setitem__(self, keys, value):
        if not isinstance(keys, tuple):
            return dict.__setitem__(self, keys, value)
        for key in keys[:-1]:
            self = self.setdefault(key, {})
        dict.__setitem__(self, keys[-1], value)

my_dict = MyDict()
my_dict[1, 2] = 3

For consistency, you could also provide __getitem__ that accepts keys in a tuple as follows:

    def __getitem__(self, keys):
        if not isinstance(keys, tuple):
            return dict.__getitem__(self, keys)
        for key in keys:
            self = dict.__getitem__(self, key)
        return self

The only downside I can think of is that we can't use tuples as dictionary keys as easily: we have to write that as, e.g. my_dict[(1, 2),].
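
Putting both methods together, here is a sketch of the whole class along with the behavior the question asks for. It uses a local variable node instead of rebinding self, which is only a readability choice; the keys 4 and 5 are arbitrary.

```python
class MyDict(dict):
    def __setitem__(self, keys, value):
        if not isinstance(keys, tuple):
            return dict.__setitem__(self, keys, value)
        node = self
        for key in keys[:-1]:
            # Intermediate levels are created only here, during assignment.
            node = node.setdefault(key, {})
        dict.__setitem__(node, keys[-1], value)

    def __getitem__(self, keys):
        if not isinstance(keys, tuple):
            return dict.__getitem__(self, keys)
        node = self
        for key in keys:
            node = dict.__getitem__(node, key)
        return node


d = MyDict()
d[1, 2] = 3                 # creates d[1] because this is an assignment
print(d[1, 2])              # 3
print(4 in d)               # False
try:
    d[4, 5]                 # lookup only: raises KeyError, does not create d[4]
except KeyError:
    pass
print(4 in d)               # False
d[(1, 2),] = "tuple key"    # one-element tuple wrapping the real key (1, 2)
print(d[(1, 2),])           # tuple key
print(d)                    # {1: {2: 3}, (1, 2): 'tuple key'}
```

Note that the intermediate levels created this way are plain dicts, so the tuple syntax only works on the top-level MyDict object.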

