Is there any way to create a code object from its disassembly acquired with dis.dis?
For example, I compiled some code using co = compile('print("lol")', '<string>', 'exec') and then printed the disassembly using dis.dis(co), and now I want to "compile" the disassembly back to a code object (since it holds all the same data and nothing is lost).
Amazingly, yes there is - sort of.
However, there are a number of caveats you need to understand. The first is that Python bytecode, and by extension its assembly instructions, can change with every release. The second is that the text emitted by dis.dis() is incomplete with respect to what the Python interpreter needs, so you need some way to fill in the missing information.
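To make the second caveat concrete, here is a small sketch that captures the dis.dis() text for the example in the question and prints, next to it, some code-object attributes that the listing never states outright. The dictionary name not_in_listing is just an illustrative choice, and the exact listing varies by Python version.

```python
import dis
import io

co = compile('print("lol")', "<string>", "exec")

# Capture the text that dis.dis() would normally print to stdout.
buf = io.StringIO()
dis.dis(co, file=buf)
listing = buf.getvalue()

# Attributes the interpreter needs that the listing does not spell out.
# The listing only hints at constants and names via instruction arguments.
not_in_listing = {
    "co_filename": co.co_filename,
    "co_name": co.co_name,
    "co_argcount": co.co_argcount,
    "co_nlocals": co.co_nlocals,
    "co_stacksize": co.co_stacksize,
    "co_flags": hex(co.co_flags),
    "co_consts": co.co_consts,  # the full table, in index order
    "co_names": co.co_names,    # likewise for names
}

print(listing)
for name, value in not_in_listing.items():
    print(f"{name:>14}: {value!r}")
```

Anything that turns the text back into a code object has to get these values from somewhere else, which is why an assembler input file needs more than the bare instruction listing.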
I have written a bytecode assembler (xasm) which converts a text assembly file, similar to what you have above, into Python bytecode.
In your example you have a code object rather than the full information needed to create a bytecode file, but the guts of xasm of course create the code objects before writing them out with the additional information needed in a bytecode file. This is done in the function create_code() of https://github.com/rocky/python-xasm/blob/master/xasm/assemble.py.
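As a rough illustration of what "creating a code object from its parts" means, the sketch below takes a compiled template and swaps one entry in its constants table. This is not how xasm's create_code() works; it is only a minimal stand-in that assumes Python 3.8 or later, where code objects have a replace() method. The names template, new_consts and patched are made up for the example.

```python
import contextlib
import io

# Template code object: the same source as the example in the question.
template = compile('print("lol")', "<string>", "exec")

# Swap one entry in the constants table. The bytecode is untouched, so the
# instruction that loads that constant picks up the new string by index.
new_consts = tuple("hello" if c == "lol" else c for c in template.co_consts)
patched = template.replace(co_consts=new_consts, co_filename="<patched>")

# Run the patched code object and capture what it prints.
out = io.StringIO()
with contextlib.redirect_stdout(out):
    exec(patched)

print(out.getvalue(), end="")  # prints: hello
```

The point is that the instruction bytes, the constants, the names and the metadata are separate fields, and an assembler has to supply every one of them consistently.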
To see the difference between what is in a code object and how that fits into a Python bytecode file, I'll use your example and then finish with how to create a bytecode file.
If I run your example in Python 3.6.10, I get:
  1           0 LOAD_NAME                0 (print)
              2 LOAD_CONST               0 ('lol')
              4 CALL_FUNCTION            1
              6 POP_TOP
              8 LOAD_CONST               1 (None)
             10 RETURN_VALUE
But if I put your Python code into a file, say foo.py, byte compile it using py_compile.compile(source, bytecode, source), and then use xdis's cross-version Python disassembler pydisasm, I get:
# pydisasm version 4.2.4
# Python bytecode 3.6 (3379)
# Disassembled from Python 3.6.10 (default, Jan 23 2020, 16:43:38)
# [GCC 7.4.0]
# Timestamp in code: 1586703495 (2020-04-12 10:58:15)
# Source code size mod 2**32: 13 bytes
# Method Name:       <module>
# Filename:          foo.py
# Argument count:    0
# Kw-only arguments: 0
# Number of locals:  0
# Stack size:        2
# Flags:             0x00000040 (NOFREE)
# First Line:        1
# Constants:
#    0: 'lol'
#    1: None
# Names:
#    0: print
  1:           0 LOAD_NAME                 0 (print)
               2 LOAD_CONST                0 ('lol')
               4 CALL_FUNCTION             1
               6 POP_TOP
               8 LOAD_CONST                1 (None)
              10 RETURN_VALUE
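Notice that this output carries a bit of additional information that the dis.dis() listing does not show. The first three items live in the bytecode file's header; the rest are attributes of the code object itself:
- which bytecode is being used (3.6, with magic number 3379),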
- a timestamp of when the code was created,
- a size (mod 2**32) of the source code,
- a method name,
- a filename,
- parameters to the code,
- method flags, and
- names of various sorts: constants, variables.
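To see where these pieces sit in an actual file, here is a hedged sketch that byte-compiles the example and splits the resulting .pyc into its header fields and the marshalled code object. It assumes the Python 3.7+ header layout (magic, flags, timestamp, source size, 16 bytes in total); the 3.6 files discussed here have no flags word, so their header is 12 bytes. The helper name read_pyc is invented for this example.

```python
import importlib.util
import marshal
import os
import py_compile
import struct
import tempfile


def read_pyc(path):
    """Split a timestamp-style .pyc (Python 3.7+ layout) into its parts."""
    with open(path, "rb") as f:
        data = f.read()
    magic = data[:4]
    # Three little-endian 32-bit words follow the magic number.
    flags, mtime, size = struct.unpack("<III", data[4:16])
    code = marshal.loads(data[16:])
    return magic, flags, mtime, size, code


source = b'print("lol")\n'

with tempfile.TemporaryDirectory() as tmp:
    src = os.path.join(tmp, "foo.py")
    pyc = os.path.join(tmp, "foo.pyc")
    with open(src, "wb") as f:
        f.write(source)
    # Force the timestamp-based header so the layout above applies.
    py_compile.compile(
        src, cfile=pyc, dfile=src,
        invalidation_mode=py_compile.PycInvalidationMode.TIMESTAMP,
    )
    magic, flags, mtime, size, code = read_pyc(pyc)

print("magic:", magic, "flags:", flags, "mtime:", mtime, "size:", size)
print("name:", code.co_name, "consts:", code.co_consts, "names:", code.co_names)
```

The source size comes out as 13 bytes, matching the pydisasm output above, and everything after the header is just the code object serialized with marshal.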
Now let's put all of this into a file like foo2.pyasm. To turn that into a bytecode file, simply run pyc-xasm:
$ pyc-xasm foo2.pyasm
Wrote foo2.pyc
$ python foo2.pyc
lol
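If you already have a code object for the running interpreter and only want the "write a bytecode file" half, the standard library is enough. The following is a sketch of that step, not what pyc-xasm does internally, and it again assumes the Python 3.7+ timestamp header layout. It only works for the Python version that runs it, whereas xasm targets other versions as well.

```python
import importlib.util
import marshal
import os
import struct
import subprocess
import sys
import tempfile
import time

source = 'print("lol")\n'
co = compile(source, "foo2.py", "exec")

# Header: magic number, flags (0 = timestamp-based), mtime, source size.
header = importlib.util.MAGIC_NUMBER + struct.pack(
    "<III", 0, int(time.time()) & 0xFFFFFFFF, len(source) & 0xFFFFFFFF
)

with tempfile.TemporaryDirectory() as tmp:
    pyc = os.path.join(tmp, "foo2.pyc")
    with open(pyc, "wb") as f:
        f.write(header + marshal.dumps(co))
    # Run the bytecode file directly, as in the shell session above.
    result = subprocess.run(
        [sys.executable, pyc], capture_output=True, text=True
    )

print(result.stdout, end="")
```

When a .pyc is run directly like this, the interpreter checks the magic number and then unmarshals the code object; the timestamp and size fields matter only when the file is imported alongside its source.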
I gave a demonstration of all of this in my lightning talk at PyColumbia 2018.
I should note that until the next release of xasm and xdis, Python 3.7 and above don't work, but 3.6 and earlier do.