Efficient way to generate Lime explanations for full dataset

2024/5/20 7:31:26

I am working on a binary classification problem with 1000 rows and 15 features.

I am currently using LIME to explain the prediction for each instance.

I use the code below to generate explanations for the full test dataframe:

test_indx_list = X_test.index.tolist()
test_dict = {}
for n in test_indx_list:
    exp = explainer.explain_instance(X_test.loc[n].values, model.predict_proba, num_features=5)
    a = exp.as_list()
    test_dict[n] = a

But this is not efficient. Is there an alternative approach to generate the explanations / get the feature contributions more quickly?

Answer

From what the docs show, there is currently no option to run explain_instance in batches, although there are plans to add one, which should help a lot with speed in future versions.

For now, the change most likely to improve speed is decreasing the number of samples used to fit the local linear model:

explainer.explain_instance(..., num_features=5, num_samples=2500)

The default value for num_samples is 5000, which can be far more than you need depending on your model, and it is currently the argument that most affects the explainer's speed.
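Applied to the loop from the question, the change looks roughly like this (2500 is only an illustrative value; lower it gradually and check that the explanations stay stable for your model):

test_dict = {}
for n in X_test.index:
    # Fewer perturbed samples per instance -> faster, at some cost in stability
    exp = explainer.explain_instance(
        X_test.loc[n].values,
        model.predict_proba,
        num_features=5,
        num_samples=2500,
    )
    test_dict[n] = exp.as_list()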

Another approach is to parallelize the snippet: explain several instances at the same time and gather the results at the end. It is a more involved solution; the link below covers the general idea, and a rough sketch of one way to apply it here follows.

https://en.xdnf.cn/q/72717.html
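As an illustration only (this is my own sketch, not something taken from the linked page), one way to parallelize the loop is with joblib. It assumes explainer, model and X_test are already defined and picklable, which is usually, but not always, the case with LIME explainers:

from joblib import Parallel, delayed

def explain_one(idx):
    # Explain a single row and return (index, list of feature contributions)
    exp = explainer.explain_instance(
        X_test.loc[idx].values,
        model.predict_proba,
        num_features=5,
        num_samples=2500,
    )
    return idx, exp.as_list()

# n_jobs=-1 uses all available CPU cores; each call runs in a separate worker
results = Parallel(n_jobs=-1)(delayed(explain_one)(idx) for idx in X_test.index)
test_dict = dict(results)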
