I arranged my Jupyter notebooks into: data.ipynb
, methods.ipynb
and results.ipynb
. How can I selectively import cells from data and methods notebooks for use in the results notebook?
I know of nbimporter
and ipynb
but neither of those offers selective import of variables. There is an option to import definitions - including variables that are uppercase - but this does not work for me as I would have to convert most of the variables in my notebooks to uppercase.
I would rather import everything except for two or three cells that take a long time to evaluate. Ideally, I would like to defer the execution of some assignments to the very moment I access them (lazy evaluation) - but I understand that it might be difficult to implement.
Here is the overview, in pseudocode (each line repesents a cell):
data.ipynb
:
raw_data = load_data()
dataset = munge(raw_data)
describe(dataset) # I want this line to be skipped at import
methods.ipynb
:
import data
method = lambda x: x * x
# showcase how the method works on a subset of the dataset
method(data.dataset[:5]) # I want this line to be skipped at import
results.ipynb
:
import data
import methods
result = methods.method(data.dataset)
describe(result)
The motivation is that my real data and methods notebooks:
- are way much longer and complicated, hence I want to use an import system
- there are only a couple of cells that take more than seconds to evaluate
also, the methods notebook cannot be replaced with methods.py
file. In fact, I have such a file which contains the implementation details of my method. The notebook is more of a place to specify default parameters, showcase how my method works and explain example results.
This question is essentially a combination of:
- How can I import from another ipython-notebook?, and
- Simple way to choose which cells to run in ipython notebook during run all
I read through answers to both and none satisfied my requirements.
In my answer below I present my solution that uses custom cell magics and monkey-patching. However, I would prefer a solution which allows specifying which cells/expressions to exclude/include not in the notebook of origin (e.g. data.ipynb
) but in the target one (e.g. in methods.ipynb
).
For example, it could use regular expressions:
# all variables starting with 'result' would be ignored
nbimporter.options['exclude'] = '^result.*'
or (even better) lazy evaluation:
# only `a` and `b` would be evaluated and imported
from data import a, b
All ideas will be appreciated!