I have this following really hack code which removes circular references from any kind of data structure built out of dict
, tuple
and list
objects.
import astdef remove_circular_refs(o):return ast.literal_eval(str(o).replace("{...}", 'None'))
But I don't like how hacky it is. Can this be done without turning the data structure into a string representation?
Here is an example structure to test with:
doc1 = {"key": "value","type": "test1",
}
doc1["self"] = doc1
doc = {'tags': 'Stackoverflow python question','type': 'Stackoverflow python question',
}
doc2 = {'value': 2,'id': 2,
}
remove_circular_refs(doc)
remove_circular_refs(doc1)
remove_circular_refs(doc2)
Don't use string conversion, no. Just detect the reference by traversing the data structure:
def remove_circular_refs(ob, _seen=None):if _seen is None:_seen = set()if id(ob) in _seen:# circular reference, remove it.return None_seen.add(id(ob))res = obif isinstance(ob, dict):res = {remove_circular_refs(k, _seen): remove_circular_refs(v, _seen)for k, v in ob.items()}elif isinstance(ob, (list, tuple, set, frozenset)):res = type(ob)(remove_circular_refs(v, _seen) for v in ob)# remove id again; only *nested* references count_seen.remove(id(ob))return res
This covers dict
, list
, tuple
, set
and frozenset
objects; it memoises the id()
of each object seen, and when it is seen again it is replaced with None
.
Demo:
>>> doc1 = {
... "key": "value",
... "type": "test1",
... }
>>> doc1["self"] = doc1
>>> doc1
{'key': 'value', 'type': 'test1', 'self': {...}}
>>> remove_circular_refs(doc1)
{'key': 'value', 'type': 'test1', 'self': None}
>>> doc2 = {
... 'foo': [],
... }
>>> doc2['foo'].append((doc2,))
>>> doc2
{'foo': [({...},)]}
>>> remove_circular_refs(doc2)
{'foo': [(None,)]}
>>> doc3 = {
... 'foo': 'string 1', 'bar': 'string 1',
... 'ham': 1, 'spam': 1
... }
>>> remove_circular_refs(doc3)
{'foo': 'string 1', 'bar': 'string 1', 'ham': 1, 'spam': 1}
The last test, for doc3
, contains shared references; both 'string 1'
and 1
exist just once in memory, with the dictionary containing multiple references to those objects.