Convert python disassembly from dis.dis back to codeobject

2024/9/20 17:42:00

Is there any way to create a code object from its disassembly as produced by dis.dis?

For example, I compiled some code using co = compile('print("lol")', '<string>', 'exec') and then printed its disassembly using dis.dis(co). Now I want to "compile" the disassembly back into a code object (since it holds all the same data and nothing is lost).

Answer

Amazingly, yes there is - sort of.

However, there are a number of caveats you need to understand. The first is that Python bytecode, and by extension its assembly instructions, can change with every release. The second is that the information emitted by dis.dis() in text form is incomplete with respect to what the Python interpreter needs, so you need some way to fill in the missing information.
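To make the second caveat concrete, here is a minimal sketch that captures the dis.dis() text for the example and prints a few code-object fields next to it. The selection of fields in `missing` is only illustrative, not an exhaustive list of what an assembler has to supply.

```python
import dis
import io

co = compile('print("lol")', '<string>', 'exec')

# Capture the text that dis.dis() would normally print to stdout.
buf = io.StringIO()
dis.dis(co, file=buf)
listing = buf.getvalue()

# A few code-object fields that the listing does not state explicitly
# (illustrative selection only).
missing = {
    "co_filename": co.co_filename,
    "co_name": co.co_name,
    "co_argcount": co.co_argcount,
    "co_stacksize": co.co_stacksize,
    "co_flags": hex(co.co_flags),
    "co_firstlineno": co.co_firstlineno,
}

print(listing)
for name, value in missing.items():
    print(f"{name:15} = {value!r}")
```

The exact instructions in the listing depend on the Python version that runs the sketch, which is the first caveat in action.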

I have written a bytecode assembler, xasm, which converts a text assembly file similar to what you have above into Python bytecode.

In your example you have a code object rather than the full information needed to create a bytecode file. Internally, though, xasm of course creates the code objects first and then writes them out together with the additional information a bytecode file needs. This is done in the function create_code() in https://github.com/rocky/python-xasm/blob/master/xasm/assemble.py
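This is not xasm's create_code(), which is not reproduced here. As a rough illustration of what building a code object from its parts looks like, the sketch below uses the standard CodeType.replace() method (an assumption: it needs Python 3.8 or later) to produce a new code object with a different constants table while reusing every other field.

```python
co = compile('print("lol")', '<string>', 'exec')

# Swap one entry of the constants table; every other field is copied as-is.
new_consts = tuple("hello" if c == "lol" else c for c in co.co_consts)
patched = co.replace(co_consts=new_consts)

exec(patched)  # prints: hello
```

An assembler has to supply all of these fields itself, since it has no existing code object to copy them from.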

To see the difference between what is in a code object and how that fits into a Python bytecode file, I'll use your example and then finish with how to create a bytecode file.

If I run your example in Python 3.6.10, I get:

  1           0 LOAD_NAME                0 (print)
              2 LOAD_CONST               0 ('lol')
              4 CALL_FUNCTION            1
              6 POP_TOP
              8 LOAD_CONST               1 (None)
             10 RETURN_VALUE

But if I put your Python code into a file, say foo.py, byte-compile it using py_compile.compile(source, bytecode, source), and then use pydisasm, the cross-version Python disassembler from xdis, I get:

  # pydisasm version 4.2.4
  # Python bytecode 3.6 (3379)
  # Disassembled from Python 3.6.10 (default, Jan 23 2020, 16:43:38)
  # [GCC 7.4.0]
  # Timestamp in code: 1586703495 (2020-04-12 10:58:15)
  # Source code size mod 2**32: 13 bytes
  # Method Name:       <module>
  # Filename:          foo.py
  # Argument count:    0
  # Kw-only arguments: 0
  # Number of locals:  0
  # Stack size:        2
  # Flags:             0x00000040 (NOFREE)
  # First Line:        1
  # Constants:
  #    0: 'lol'
  #    1: None
  # Names:
  #    0: print
    1:           0 LOAD_NAME                 0 (print)
                 2 LOAD_CONST                0 ('lol')
                 4 CALL_FUNCTION             1
                 6 POP_TOP
                 8 LOAD_CONST                1 (None)
                10 RETURN_VALUE

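Notice that this listing carries additional information that the dis.dis() text does not. Some of it belongs to the bytecode file rather than to the code object (the magic number, timestamp, and source size); the rest is stored in the code object but is not printed by dis.dis():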

  • which bytecode is being used, (3.6 with magic number 3379),
  • a timestamp of when the code was created,
  • a size (mod 2**32) of the source code,
  • a method name,
  • a filename,
  • parameters to the code,
  • method flags, and
  • names of various sorts: constants, variables.
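The file-level items in this list can be read back with the standard library alone. The following is a minimal sketch, assuming Python 3.7 or later, where the header is 16 bytes (magic, flags, timestamp, source size); Python 3.6, used in the listing above, has a 12-byte header without the flags word. The temporary paths are only for illustration.

```python
import importlib.util
import marshal
import os
import py_compile
import struct
import tempfile

source_bytes = b'print("lol")\n'  # 13 bytes, as in the listing above

with tempfile.TemporaryDirectory() as tmp:
    src = os.path.join(tmp, "foo.py")
    pyc = os.path.join(tmp, "foo.pyc")
    with open(src, "wb") as f:
        f.write(source_bytes)
    # Force the timestamp-based header so the fields below have a fixed meaning.
    py_compile.compile(
        src, cfile=pyc, dfile=src, doraise=True,
        invalidation_mode=py_compile.PycInvalidationMode.TIMESTAMP,
    )
    with open(pyc, "rb") as f:
        data = f.read()

# Python 3.7+ header layout: magic, flags, mtime, source size (4 bytes each).
magic = data[:4]
flags, mtime, size = struct.unpack("<III", data[4:16])
code = marshal.loads(data[16:])  # the rest is the marshalled code object

print("magic     :", magic)
print("flags     :", flags)
print("timestamp :", mtime)
print("src size  :", size)
print("name      :", code.co_name)
print("filename  :", code.co_filename)
print("constants :", code.co_consts)
print("names     :", code.co_names)
```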

Now let's put all of this into a file, say foo2.pyasm. To turn that into a bytecode file, simply run pyc-xasm:

  $ pyc-xasm foo2.pyasm
  Wrote foo2.pyc
  $ python foo2.pyc
  lol
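For comparison, the final "write a bytecode file" step can be sketched with the standard library for the Python version that is currently running. This is not how pyc-xasm is implemented, and it starts from a compiled code object rather than from assembly text. It assumes the Python 3.7+ header layout (magic, flags, timestamp, source size), and the timestamp and size values are placeholders.

```python
import importlib.util
import marshal
import os
import struct
import subprocess
import sys
import tempfile
import time

co = compile('print("lol")', "foo.py", "exec")

# Header: magic number, flags (0 = timestamp-based), mtime, source size.
# The mtime and size are placeholders; they are not checked when the
# .pyc file is run directly.
header = importlib.util.MAGIC_NUMBER + struct.pack(
    "<III", 0, int(time.time()) & 0xFFFFFFFF, 13
)

with tempfile.TemporaryDirectory() as tmp:
    pyc_path = os.path.join(tmp, "foo2.pyc")
    with open(pyc_path, "wb") as f:
        f.write(header + marshal.dumps(co))
    # Equivalent to running "python foo2.pyc" from the shell.
    result = subprocess.run(
        [sys.executable, pyc_path], capture_output=True, text=True
    )

output = result.stdout
print(output, end="")
```

Because the magic number comes from the running interpreter, the resulting file only loads on that same bytecode version, unlike a cross-version tool.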

I gave a demonstration of all of this in my lightning talk at PyColumbia 2018.

I should note that until the next release of xasm and xdis, Python 3.7 and above don't work, but 3.6 and earlier do.

