Question 1

TL;DR
How can I get superkeys to be autovivified in a Python dict when assigning values to subkeys, without also getting them autovivified when checking for subkeys?

Background: Normally in Python, setting values in a nested dictionary requires manually ensuring that higher-level keys exist before assigning to their sub-keys. That is,

my_dict[1][2] = 3

raises a KeyError when the key 1 is not yet in my_dict, unless you first do something like

if 1 not in my_dict:
    my_dict[1] = {}

Now, it is possible to set up a kind of autovivification by making my_dict an instance of a class that overrides __missing__, as shown e.g. in https://stackoverflow.com/a/19829714/6670909.
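For readers who don't want to follow the link, here is a minimal sketch of what such a class usually looks like. It is written from the general __missing__ idiom rather than copied from the linked answer, so treat the exact body as an assumption:

```python
class Vividict(dict):
    """Nested dict that creates missing keys on lookup (sketch)."""

    def __missing__(self, key):
        # dict.__getitem__ calls __missing__ when the key is absent.
        # Store a new, empty Vividict under that key and return it.
        value = self[key] = type(self)()
        return value


vd = Vividict()
vd[1][2] = 3   # vd[1] is created by __missing__, then 3 is stored under key 2
print(vd)      # {1: {2: 3}}
```

Note that __missing__ fires on any lookup of an absent key, not only on lookups that are part of an assignment.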

Question: However, that solution silently autovivifies higher-level keys if you check for the existence of a sub-key in such a nested dict. That leads to the following unfortunateness:

>>> vd = Vividict()
>>> 1 in vd
False
>>> 2 in vd[1]
False
>>> 1 in vd
True

How can I avoid that misleading result? In Perl, by the way, I can get the desired behavior by doing

no autovivification qw/exists/;

And basically I'd like to replicate that behavior in Python if possible.

Answer

This is not an easy problem to solve, because in your example:

my_dict[1][2] = 3

my_dict[1] results in a __getitem__ call on the dictionary. There is no way at that point to know that an assignment is being made. Only the last [] in the sequence is a __setitem__ call, and it can't succeed unless my_dict[1] exists, because otherwise, what object are you assigning into?

So don't use autovivification. You can use setdefault() instead, with a regular dict.
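To see that call sequence for yourself, here is a small tracing sketch. TracingDict and the calls list are names made up for this illustration; they are not part of any library:

```python
calls = []  # records which special methods get invoked, in order


class TracingDict(dict):
    """dict subclass that logs item reads and writes (illustration only)."""

    def __getitem__(self, key):
        calls.append(("__getitem__", key))
        return dict.__getitem__(self, key)

    def __setitem__(self, key, value):
        calls.append(("__setitem__", key))
        dict.__setitem__(self, key, value)


outer = TracingDict()
# Pre-populate key 1 without going through the traced __setitem__.
dict.__setitem__(outer, 1, TracingDict())

outer[1][2] = 3
print(calls)  # [('__getitem__', 1), ('__setitem__', 2)]
```

The outer dictionary only ever sees a plain read of key 1; the write goes to whatever object that read returned.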

my_dict.setdefault(1, {})[2] = 3
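A quick sketch of how this behaves with a plain dict, including the existence checks from the question. Using .get() with an empty-dict default for the sub-key check is my own choice here, one of several ways to test without raising:

```python
my_dict = {}

my_dict.setdefault(1, {})[2] = 3   # key 1 is missing, so {} is inserted first
my_dict.setdefault(1, {})[4] = 5   # key 1 exists, so the existing inner dict is reused
print(my_dict)                     # {1: {2: 3, 4: 5}}

# Checking for keys never creates anything in a regular dict.
print(7 in my_dict)                # False
print(2 in my_dict.get(7, {}))     # False
print(7 in my_dict)                # False
```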

Now that's not exactly pretty, especially when you are nesting more deeply, so you might write a helper method:

class MyDict(dict):
    def nest(self, keys, value):
        for key in keys[:-1]:
            self = self.setdefault(key, {})
        self[keys[-1]] = value

my_dict = MyDict()
my_dict.nest((1, 2), 3)       # my_dict[1][2] = 3

But even better is to wrap this into a new __setitem__ that takes all the indexes at once, instead of requiring the intermediate __getitem__ calls that induce the autovivification. This way, we know from the beginning that we're doing an assignment and can proceed without relying on autovivification.

class MyDict(dict):
    def __setitem__(self, keys, value):
        if not isinstance(keys, tuple):
            return dict.__setitem__(self, keys, value)
        for key in keys[:-1]:
            self = self.setdefault(key, {})
        dict.__setitem__(self, keys[-1], value)

my_dict = MyDict()
my_dict[1, 2] = 3

For consistency, you could also provide __getitem__ that accepts keys in a tuple as follows:

    def __getitem__(self, keys):
        if not isinstance(keys, tuple):
            return dict.__getitem__(self, keys)
        for key in keys:
            self = dict.__getitem__(self, key)
        return self

The only downside I can think of is that we can't use tuples as dictionary keys as easily: we have to write that as, e.g. my_dict[(1, 2),].
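Putting both methods together, here is a self-contained sketch you can run to check the behavior asked for in the question. It uses a local variable named node instead of rebinding self; that rename is purely cosmetic and my own choice:

```python
class MyDict(dict):
    """dict that treats a tuple index as a path of nested keys (sketch)."""

    def __setitem__(self, keys, value):
        if not isinstance(keys, tuple):
            return dict.__setitem__(self, keys, value)
        node = self
        for key in keys[:-1]:
            # Create intermediate levels only here, during assignment.
            node = node.setdefault(key, {})
        dict.__setitem__(node, keys[-1], value)

    def __getitem__(self, keys):
        if not isinstance(keys, tuple):
            return dict.__getitem__(self, keys)
        node = self
        for key in keys:
            # Plain lookup: a missing key raises KeyError, nothing is created.
            node = dict.__getitem__(node, key)
        return node


my_dict = MyDict()
my_dict[1, 2] = 3
print(my_dict)         # {1: {2: 3}}
print(my_dict[1, 2])   # 3

try:
    my_dict[5, 6]      # reading a missing path
except KeyError:
    print("KeyError")
print(5 in my_dict)    # False

# A real tuple key has to be wrapped in a one-element tuple.
my_dict[(1, 2),] = "tuple key"
print(my_dict[(1, 2),])  # tuple key
```

Reads of missing paths fail loudly and leave the dictionary untouched, so only assignments create intermediate levels.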

How to implement autovivification for nested dictionary ONLY when assigning values?
