Using Yuri Nudelman's code with the custom _import
definition to specify modules to restrict serves as a good base but when calling functions within said user_code
naturally due to having to whitelist everything is there any way to permit other user defined functions to be called? Open to other sandboxing solutions although Jupyter didn't seem straight-forward to embed within a web interface.
from RestrictedPython import safe_builtins, compile_restricted
from RestrictedPython.Eval import default_guarded_getitemdef _import(name, globals=None, locals=None, fromlist=(), level=0):safe_modules = ["math"]if name in safe_modules:globals[name] = __import__(name, globals, locals, fromlist, level)else:raise Exception("Don't you even think about it {0}".format(name))safe_builtins['__import__'] = _import # Must be a part of builtinsdef execute_user_code(user_code, user_func, *args, **kwargs):""" Executed user code in restricted envArgs:user_code(str) - String containing the unsafe codeuser_func(str) - Function inside user_code to execute and return value*args, **kwargs - arguments passed to the user functionReturn:Return value of the user_func"""def _apply(f, *a, **kw):return f(*a, **kw)try:# This is the variables we allow user code to see. @result will contain return value.restricted_locals = { "result": None,"args": args,"kwargs": kwargs,} # If you want the user to be able to use some of your functions inside his code,# you should add this function to this dictionary.# By default many standard actions are disabled. Here I add _apply_ to be able to access# args and kwargs and _getitem_ to be able to use arrays. Just think before you add# something else. I am not saying you shouldn't do it. You should understand what you# are doing thats all.restricted_globals = { "__builtins__": safe_builtins,"_getitem_": default_guarded_getitem,"_apply_": _apply,} # Add another line to user code that executes @user_funcuser_code += "\nresult = {0}(*args, **kwargs)".format(user_func)# Compile the user codebyte_code = compile_restricted(user_code, filename="<user_code>", mode="exec")# Run itexec(byte_code, restricted_globals, restricted_locals)# User code has modified result inside restricted_locals. Return it.return restricted_locals["result"]except SyntaxError as e:# Do whaever you want if the user has code that does not compileraiseexcept Exception as e:# The code did something that is not allowed. Add some nasty punishment to the user here.raisei_example = """
import mathdef foo():return 7def myceil(x):return math.ceil(x)+foo()
"""
print(execute_user_code(i_example, "myceil", 1.5))
Running this returns 'foo' is not defined
First of all, the replacement for the __import__
built-in is implemented incorrectly. That built-in is supposed to return the imported module, not mutate the globals to include it:
Python 3.9.12 (main, Mar 24 2022, 13:02:21)
[GCC 11.2.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> __import__('math')
<module 'math' (built-in)>
>>> math
Traceback (most recent call last):File "<stdin>", line 1, in <module>
NameError: name 'math' is not defined
A better way to reimplement __import__
would be this:
_SAFE_MODULES = frozenset(("math",))def _safe_import(name, *args, **kwargs):if name not in _SAFE_MODULES:raise Exception(f"Don't you even think about {name!r}")return __import__(name, *args, **kwargs)
The fact that you mutated globals in your original implementation was partially masking the primary bug. Namely: name assignments within restricted code (function definitions, variable assignments and imports) mutate the locals dict, but name look-ups are by default done as global look-ups, bypassing the locals entirely. You can see this by disassembling the restricted bytecode using __import__('dis').dis(byte_code)
:
2 0 LOAD_CONST 0 (0)2 LOAD_CONST 1 (None)4 IMPORT_NAME 0 (math)6 STORE_NAME 0 (math)4 8 LOAD_CONST 2 (<code object foo at 0x7fbef4eef3a0, file "<user_code>", line 4>)10 LOAD_CONST 3 ('foo')12 MAKE_FUNCTION 014 STORE_NAME 1 (foo)7 16 LOAD_CONST 4 (<code object myceil at 0x7fbef4eef660, file "<user_code>", line 7>)18 LOAD_CONST 5 ('myceil')20 MAKE_FUNCTION 022 STORE_NAME 2 (myceil)24 LOAD_CONST 1 (None)26 RETURN_VALUEDisassembly of <code object foo at 0x7fbef4eef3a0, file "<user_code>", line 4>:5 0 LOAD_CONST 1 (7)2 RETURN_VALUEDisassembly of <code object myceil at 0x7fbef4eef660, file "<user_code>", line 7>:8 0 LOAD_GLOBAL 0 (_getattr_)2 LOAD_GLOBAL 1 (math)4 LOAD_CONST 1 ('ceil')6 CALL_FUNCTION 28 LOAD_FAST 0 (x)10 CALL_FUNCTION 112 LOAD_GLOBAL 2 (foo)14 CALL_FUNCTION 016 BINARY_ADD18 RETURN_VALUE
The documentation for exec
explains (emphasis mine):
If only globals is provided, it must be a dictionary (and not a subclass of dictionary), which will be used for both the global and the local variables. If globals and locals are given, they are used for the global and local variables, respectively. If provided, locals can be any mapping object. Remember that at the module level, globals and locals are the same dictionary. If exec gets two separate objects as globals and locals, the code will be executed as if it were embedded in a class definition.
This makes separate mappings for locals and globals completely spurious. We can therefore simply get rid of the locals dict, and put everything in globals. The entire code should look something like this:
from RestrictedPython import safe_builtins, compile_restricted_SAFE_MODULES = frozenset(("math",))def _safe_import(name, *args, **kwargs):if name not in _SAFE_MODULES:raise Exception(f"Don't you even think about {name!r}")return __import__(name, *args, **kwargs)def execute_user_code(user_code, user_func, *args, **kwargs):my_globals = {"__builtins__": {**safe_builtins,"__import__": _safe_import,},}try:byte_code = compile_restricted(user_code, filename="<user_code>", mode="exec")except SyntaxError:# syntax error in the sandboxed coderaisetry:exec(byte_code, my_globals)return my_globals[user_func](*args, **kwargs)except BaseException:# runtime error (probably) in the sandboxed coderaise
Above I also managed to fix a couple of tangential issues:
- Instead of injecting the function call into the compiled snippet, I look up the function in the globals dict directly. This avoids a potential code injection vector if
user_func
happens to come from an untrusted source, and avoids having to inject args
, kwargs
and result
into the sandbox, which would enable sandboxed code to clobber it.
- I avoid mutating the
safe_builtins
object provided by the RestrictedPython
module. Otherwise, if any other code within your program happens to be using RestrictedPython
, it might have been affected.
- I split the exception handling between the two steps: compilation and execution. This minimises the probability that bugs in the sandbox code will be misattributed to the sandboxed code.
- I changed the caught runtime exception type to
BaseException
, to also catch cases when sandboxed code attempts to raise KeyboardInterrupt
or SystemExit
(which do not derive from Exception
, but only BaseException
).
- I also removed references to
_getitem_
and _apply_
, which don’t seem to be used for anything. If they turn out to be necessary after all, you may restore them.
(Note, however, that this still does not protect against DoS via infinite loops within the sandbox.)