Remove circular references in dicts, lists, tuples

2024/10/5 19:09:56

I have the following really hacky code, which removes circular references from any kind of data structure built out of dict, tuple and list objects.

import ast

def remove_circular_refs(o):
    return ast.literal_eval(str(o).replace("{...}", 'None'))

But I don't like how hacky it is. Can this be done without turning the data structure into a string representation?
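To make the fragility concrete, here is a minimal sketch of two ways the string round-trip can go wrong. The name `remove_circular_refs_via_str` and both sample inputs are hypothetical, chosen only for illustration: a plain string that happens to contain `{...}` gets rewritten, and any value that is not a Python literal makes `ast.literal_eval` raise `ValueError`.

```python
import ast
import datetime


def remove_circular_refs_via_str(o):
    # Same string round-trip as the snippet above, under a hypothetical name.
    return ast.literal_eval(str(o).replace("{...}", "None"))


# Case 1: no cycle at all, but the text "{...}" appears inside a string value.
# The blanket replace() cannot tell it apart from a recursive-repr marker.
doc = {"note": "use {...} as a placeholder"}
mangled = remove_circular_refs_via_str(doc)
print(mangled)

# Case 2: a value whose repr is not a literal (here a date) cannot be parsed back.
try:
    remove_circular_refs_via_str({"when": datetime.date(2024, 10, 5)})
    literal_eval_failed = False
except ValueError:
    literal_eval_failed = True
print(literal_eval_failed)
```

In both cases the input contains no circular reference, yet the result is either silently altered or an exception.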

Here is an example structure to test with:

doc1 = {
    "key": "value",
    "type": "test1",
}
doc1["self"] = doc1
doc = {
    'tags': 'Stackoverflow python question',
    'type': 'Stackoverflow python question',
}
doc2 = {
    'value': 2,
    'id': 2,
}
remove_circular_refs(doc)
remove_circular_refs(doc1)
remove_circular_refs(doc2)

Answer

Don't use string conversion, no. Just detect the reference by traversing the data structure:

def remove_circular_refs(ob, _seen=None):
    if _seen is None:
        _seen = set()
    if id(ob) in _seen:
        # circular reference, remove it.
        return None
    _seen.add(id(ob))
    res = ob
    if isinstance(ob, dict):
        res = {
            remove_circular_refs(k, _seen): remove_circular_refs(v, _seen)
            for k, v in ob.items()}
    elif isinstance(ob, (list, tuple, set, frozenset)):
        res = type(ob)(remove_circular_refs(v, _seen) for v in ob)
    # remove id again; only *nested* references count
    _seen.remove(id(ob))
    return res

This covers dict, list, tuple, set and frozenset objects. It tracks the id() of every object currently being traversed, and when one of those objects turns up again inside itself, that reference is replaced with None.

Demo:

>>> doc1 = {
...     "key": "value",
...     "type": "test1",
... }
>>> doc1["self"] = doc1
>>> doc1
{'key': 'value', 'type': 'test1', 'self': {...}}
>>> remove_circular_refs(doc1)
{'key': 'value', 'type': 'test1', 'self': None}
>>> doc2 = {
...     'foo': [],
... }
>>> doc2['foo'].append((doc2,))
>>> doc2
{'foo': [({...},)]}
>>> remove_circular_refs(doc2)
{'foo': [(None,)]}
>>> doc3 = {
...     'foo': 'string 1', 'bar': 'string 1',
...     'ham': 1, 'spam': 1
... }
>>> remove_circular_refs(doc3)
{'foo': 'string 1', 'bar': 'string 1', 'ham': 1, 'spam': 1}

The last test, for doc3, contains shared references; both 'string 1' and 1 exist just once in memory, with the dictionary containing multiple references to those objects. They are kept as-is, since an id() only stays in _seen while its object is still being traversed.
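The same distinction applies to shared containers, not only to strings and small integers. The following is a minimal sketch that reuses the function from this answer; the names `shared`, `doc4` and `cleaned` are illustrative and not part of the original question. One list is referenced from three places without being nested inside itself, and the dictionary additionally points back at itself.

```python
def remove_circular_refs(ob, _seen=None):
    if _seen is None:
        _seen = set()
    if id(ob) in _seen:
        # circular reference, remove it.
        return None
    _seen.add(id(ob))
    res = ob
    if isinstance(ob, dict):
        res = {
            remove_circular_refs(k, _seen): remove_circular_refs(v, _seen)
            for k, v in ob.items()}
    elif isinstance(ob, (list, tuple, set, frozenset)):
        res = type(ob)(remove_circular_refs(v, _seen) for v in ob)
    # remove id again; only *nested* references count
    _seen.remove(id(ob))
    return res


shared = [1, 2]
doc4 = {"a": shared, "b": shared, "nested": {"again": shared}}
doc4["me"] = doc4  # the only true cycle in this structure

cleaned = remove_circular_refs(doc4)
print(cleaned)
```

Only the "me" entry becomes None; every occurrence of the shared list survives. Note that the function rebuilds containers, so each occurrence in the result is a separate copy and the original sharing is not preserved.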

