Low-level introspection in Python 3?

2024/9/27 19:20:47

Is there an introspection method that reliably obtains the underlying data structure of an object instance and is unaffected by any customizations?

In Python 3 an object's low-level implementation can be deeply obscured: Attribute lookup can be customized, and even the __dict__ and __slots__ attributes may not give a full picture, as they are writeable. dir() is explicitly meant to show "interesting" attributes rather than actual attributes, and even the inspect module doesn't seem to provide such functionality.

Not a duplicate. This question has been flagged as a duplicate of "Is there a built-in function to print all the current properties and values of an object?". However, that other question only covers the standard ways of introspecting classes, which are explicitly listed here as unreliable at a lower level.

As an example consider the following script with an intentionally obscured class.

import inspect

actual_members = None  # <- For showing the actual contents later.

class ObscuredClass:
    def __init__(self):
        global actual_members
        actual_members = dict()
        self.__dict__ = actual_members
        self.actual_field = "actual_value"

    def __getattribute__(self, name):
        if name == "__dict__":
            return { "fake_field": "fake value - shown in __dict__" }
        else:
            return "fake_value - shown in inspect.getmembers()"

obj = ObscuredClass()
print(f"{actual_members          = }")
print(f"{dir(obj)                = }")
print(f"{obj.__dict__            = }")
print(f"{inspect.getmembers(obj) = }")

which produces the output

actual_members          = {'actual_field': 'actual_value'}
dir(obj)                = ['fake_field']
obj.__dict__            = {'fake_field': 'fake value - shown in __dict__'}
inspect.getmembers(obj) = [('fake_field', 'fake_value - shown in inspect.getmembers()')]

Answer

There's nothing completely general, particularly for objects implemented in C. Python types just don't store enough instance layout metadata for a general solution. That said, gc.get_referents is pretty reliable even in the face of really weird Python-level modifications, including deleted or shadowed slot descriptors and a deleted or shadowed __dict__ descriptor.

gc.get_referents will give all references an object reports to the garbage collection system. It won't tell you why an object had a particular reference, though - it won't tell you that one dict was __dict__ and one dict was an unrelated slot that happened to have a dict in it.

For example:

import gc

class Foo:
    __slots__ = ('__dict__', 'a', 'b')
    __dict__ = None

    def __init__(self):
        self.x = 1
        self.a = 2
        self.b = 3

x = Foo()
del Foo.a
del Foo.b

print(gc.get_referents(x))

for name in '__dict__', 'x', 'a', 'b':
    try:
        print(name, object.__getattribute__(x, name))
    except AttributeError:
        print('object.__getattribute__ could not look up', name)

This prints

[2, 3, {'x': 1}, <class '__main__.Foo'>]
__dict__ None
x 1
object.__getattribute__ could not look up a
object.__getattribute__ could not look up b

gc.get_referents manages to retrieve the real instance dict and the a and b slots, even when the relevant descriptors are all missing. Unfortunately, it gives no information about the meaning of any references it retrieves.
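
Applied to the question's ObscuredClass, the same idea might look like the sketch below. It assumes CPython, and the helper name referent_dicts is hypothetical. Filtering on type(r) is dict is only a heuristic: gc.get_referents does not say which referent is the instance dict, so a slot that happens to hold a dict would be returned too. The exact list of referents may also vary between CPython versions.

```python
import gc

# Kept in a global only so the result can be compared against it later.
actual_members = {}


class ObscuredClass:
    def __init__(self):
        self.__dict__ = actual_members
        self.actual_field = "actual_value"

    def __getattribute__(self, name):
        # Every Python-level attribute lookup is answered with fake data.
        if name == "__dict__":
            return {"fake_field": "fake value"}
        return "fake_value"


def referent_dicts(obj):
    """Hypothetical helper: plain dicts that obj reports to the garbage collector.

    Heuristic only: the result is not guaranteed to be the instance dict.
    """
    return [r for r in gc.get_referents(obj) if type(r) is dict]


obj = ObscuredClass()
found = referent_dicts(obj)
print(found)
```

Because gc.get_referents goes through the type's C-level traversal function rather than attribute lookup, the overridden __getattribute__ is never consulted here.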

object.__getattribute__ fails to retrieve the instance dict or the a or b slots. It does manage to find x, because it doesn't rely on the __dict__ descriptor to find the instance dict when retrieving other attributes, but you need to already know x is a name you should look for - object.__getattribute__ can't discover what names you should look for on this object.
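
When the names are already known, as in the question's example, object.__getattribute__ can bypass a customized __getattribute__. The following is a minimal sketch, assuming CPython and a class whose __dict__ descriptor has not been deleted or shadowed (unlike Foo above). The class is a shortened copy of the question's ObscuredClass.

```python
class ObscuredClass:
    def __init__(self):
        self.actual_field = "actual_value"

    def __getattribute__(self, name):
        # Customized lookup: every attribute appears to have this value.
        return "fake_value"


obj = ObscuredClass()

# Normal attribute access goes through the overridden __getattribute__.
fake = obj.actual_field

# Calling object.__getattribute__ directly skips the override.
real = object.__getattribute__(obj, "actual_field")
real_dict = object.__getattribute__(obj, "__dict__")

print(fake)
print(real)
print(real_dict)
```

This only helps for names known in advance, and it still depends on the descriptors defined on the type, which is why it fails for Foo once they are removed.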

