- I fit a logistic regression model on my training dataset using the following:
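import sklearn
from sklearn.linear_model import LogisticRegression
# solver='liblinear' is added here because the L1 penalty needs a solver that supports it
lr = LogisticRegression(C=0.1, penalty='l1', solver='liblinear')
# all columns but the last are features; the last column holds the labels
model = lr.fit(training[:, 0:-1], training[:, -1])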
- I have a cross-validation dataset whose labels are stored in the last column of the input matrix and can be accessed as
cv[:,-1]
- I run my cross-validation dataset through the trained model, which returns a list of 0s and 1s as predictions:
cv_predict = model.predict(cv[:,0:-1])
Question
I want to calculate the precision and recall scores from the actual labels and the predicted labels. Is there a standard way to do this using numpy/scipy/scikits?
Thank you
Yes, there is; see the classification metrics in the documentation: http://scikit-learn.org/stable/modules/classes.html#classification-metrics
You should also have a look at the sklearn.metrics.classification_report utility:
>>> from sklearn.metrics import classification_report
>>> from sklearn.linear_model import SGDClassifier
>>> from sklearn.datasets import load_digits
>>> digits = load_digits()
>>> n_samples, n_features = digits.data.shape
>>> n_split = n_samples // 2
>>> clf = SGDClassifier().fit(digits.data[:n_split], digits.target[:n_split])
>>> predictions = clf.predict(digits.data[n_split:])
>>> expected = digits.target[n_split:]
>>> print(classification_report(expected, predictions))
             precision    recall  f1-score   support

          0       0.90      0.98      0.93        88
          1       0.81      0.69      0.75        91
          2       0.94      0.98      0.96        86
          3       0.94      0.85      0.89        91
          4       0.90      0.93      0.91        92
          5       0.92      0.92      0.92        91
          6       0.92      0.97      0.94        91
          7       1.00      0.85      0.92        89
          8       0.71      0.89      0.79        88
          9       0.89      0.83      0.86        92

avg / total       0.89      0.89      0.89       899
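For the binary case in the question, the individual scores can also be computed directly. The following is a minimal sketch, assuming 0/1 labels with 1 as the positive class; `y_true` and `y_pred` are made-up arrays standing in for `cv[:,-1]` and `cv_predict`, and the NumPy part is only a cross-check of what the scikit-learn functions return.

```python
import numpy as np
from sklearn.metrics import precision_score, recall_score

# Hypothetical stand-ins for cv[:, -1] (actual labels) and
# model.predict(cv[:, 0:-1]) (predicted labels)
y_true = np.array([1, 0, 1, 1, 0, 1, 0, 0])
y_pred = np.array([1, 0, 0, 1, 1, 0, 0, 0])

# scikit-learn: by default the positive class is the label 1
precision = precision_score(y_true, y_pred)  # TP / (TP + FP)
recall = recall_score(y_true, y_pred)        # TP / (TP + FN)

# The same quantities by hand with NumPy, as a cross-check
tp = np.sum((y_true == 1) & (y_pred == 1))  # true positives
fp = np.sum((y_true == 0) & (y_pred == 1))  # false positives
fn = np.sum((y_true == 1) & (y_pred == 0))  # false negatives
manual_precision = tp / (tp + fp)
manual_recall = tp / (tp + fn)

print(precision, recall)
print(manual_precision, manual_recall)
```

With the real data, replacing the two arrays with `cv[:,-1]` and `cv_predict` should be all that is needed, provided the labels are encoded as 0 and 1.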