I'd like to select some points on a plot (e.g. with box_select or lasso_select) and retrieve them in a Jupyter notebook for further data exploration. How can I do that?
For instance, in the code below, how can I export the selection from Bokeh to the notebook? If I need a Bokeh server, that is fine too (I saw in the docs that I could add "two-way communication" with a server, but I did not manage to adapt the example to reach my goal).
from random import random
from bokeh.io import output_notebook, show
from bokeh.plotting import figure
from bokeh.models.sources import ColumnDataSource

output_notebook()

x = [random() for x in range(1000)]
y = [random() for y in range(1000)]

s = ColumnDataSource(data=dict(x=x, y=y))

fig = figure(tools=['box_select', 'lasso_select', 'reset'])
fig.circle("x", "y", source=s, alpha=0.6)

show(fig)

# Select on the plot
# Get selection in a ColumnDataSource, or index list, or pandas object, or etc.?
Notes
- I saw some related questions on SO, but most answers are for outdated versions of Bokeh (0.x or 1.x). I'm looking for an answer for v>=2.
- I am open to solutions with other visualization libraries such as altair.
To select some points on a plot and retrieve them in a Jupyter notebook, you can use a CustomJS callback.
Within the CustomJS callback's JavaScript code, you can access the Jupyter notebook kernel using IPython.notebook.kernel (a global exposed by the classic Jupyter Notebook front end; it is not available in JupyterLab). Then, you can use kernel.execute(python_code) to run Python code and, for example, export data from the JavaScript call to the Jupyter notebook.
So, a Bokeh server is not necessary for two-way communication between the Bokeh plot and the Jupyter notebook.
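To picture what kernel.execute does with the string it receives, here is a rough sketch that uses Python's exec as a stand-in for the kernel. The names run_in_namespace and notebook_globals are illustrative only and are not part of Bokeh or Jupyter; the real call sends the code to the kernel as a message rather than calling exec directly. It also shows that a bare comma-separated value parses differently depending on how many items it has.

```python
def run_in_namespace(code, namespace):
    """Hypothetical stand-in for kernel.execute: run Python source in a dict."""
    exec(code, namespace)
    return namespace


# Stands in for the notebook's global namespace (an assumption for illustration).
notebook_globals = {}

# Several indices: the comma-separated text parses as a tuple.
run_in_namespace("selected_indices = 3,17,42", notebook_globals)
many = notebook_globals["selected_indices"]
print(many)

# A single index: the same kind of text yields a bare int, not a tuple.
run_in_namespace("selected_indices = 5", notebook_globals)
single = notebook_globals["selected_indices"]
print(type(single).__name__)
```

Because of the single-value case, it is worth making sure the value is a sequence before looping over it on the Python side.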
Below, I have extended your example code to include a CustomJS callback that triggers on a selection geometry event in the figure. Whenever a selection is made, the callback runs and exports the indices of the selected data points to a variable within the Jupyter notebook called selected_indices.
To obtain a ColumnDataSource that contains the selected data points, the selected_indices tuple is looped through to create lists of the selected x and y values, which are then passed to a ColumnDataSource constructor.
from random import random
from bokeh.io import output_notebook, show
from bokeh.plotting import figure
from bokeh.models.sources import ColumnDataSource
from bokeh.models.callbacks import CustomJS

output_notebook()

x = [random() for x in range(1000)]
y = [random() for y in range(1000)]

s = ColumnDataSource(data=dict(x=x, y=y))

fig = figure(tools=['box_select', 'lasso_select', 'reset'])
fig.circle("x", "y", source=s, alpha=0.6)

# make a custom javascript callback that exports the indices of the selected points to the Jupyter notebook
callback = CustomJS(args=dict(s=s), code="""
    console.log('Running CustomJS callback now.');
    var indices = s.selected.indices;
    var kernel = IPython.notebook.kernel;
    // write the indices as a Python tuple literal, e.g. "(3,17,)", so that
    // an empty or single-point selection still produces a tuple
    var items = Array.from(indices).map(function (i) { return i + ","; }).join("");
    kernel.execute("selected_indices = (" + items + ")");
""")

# set the callback to run when a selection geometry event occurs in the figure
fig.js_on_event('selectiongeometry', callback)

show(fig)

# make a selection using a selection tool

# inspect the selected indices
selected_indices

# use the indices to create lists of the selected values
x_selected, y_selected = [], []
for index in selected_indices:
    x_val = s.data['x'][index]
    y_val = s.data['y'][index]
    x_selected.append(x_val)
    y_selected.append(y_val)

# make a column data source containing the selected values
selected = ColumnDataSource(data=dict(x=x_selected, y=y_selected))

# inspect the selected data
selected.data
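
If the source has more than two columns, the same lookup can be written once for every column. The helper below is a minimal sketch, not part of Bokeh: select_rows is a hypothetical name, and it assumes the data is a dict of equal-length lists (as in the example above) and that the indices are either a sequence of ints or a single int.

```python
def select_rows(data, indices):
    """Return a dict with only the rows of `data` found at `indices`.

    `data` is assumed to map column names to equal-length lists.
    A bare int is treated as a one-element selection.
    """
    if isinstance(indices, int):
        indices = (indices,)
    return {name: [values[i] for i in indices] for name, values in data.items()}


# Small stand-in for s.data so the sketch runs without Bokeh.
data = {"x": [0.1, 0.2, 0.3, 0.4], "y": [1.0, 2.0, 3.0, 4.0]}
subset = select_rows(data, (0, 2))
print(subset)

# In the notebook, the result could then be used as, for example:
#   selected = ColumnDataSource(data=select_rows(s.data, selected_indices))
#   df = pandas.DataFrame(select_rows(s.data, selected_indices))
```

The returned dict of lists is a shape that both the ColumnDataSource constructor and pandas.DataFrame accept, which covers the "pandas object" case from the question.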