sklearn.metrics.roc_curve only shows 5 fprs, tprs, thresholds [closed]

2024/11/20 16:37:43

I have an array of length 520, but metrics.roc_curve returns only a few fpr, tpr, and threshold values.

These are some values of my score array:

[... 4.6719894  5.3444934  2.575739   3.5660675  3.4357991  4.195427
4.120169   5.021058   5.308503   5.3124313  4.8253884  4.7469654
5.0011086  5.170149   4.5555115  4.4109273  4.6183085  4.356304
4.413242   4.1186514  5.0573816  4.646429   5.063631   4.363433
5.431669   6.1605806  6.1510544  4.8733225  6.0209446  6.5198536
5.1457767  1.3887328  1.3165888  1.143339   1.717379   1.6670974
1.1816382  1.2497046  1.035109   1.4904765  1.195155   1.2590547
1.0998954  1.6484532  1.5722921  1.2841778  1.1058662  1.3368237
1.3262213  1.215088   1.4224783  1.046008   1.262415   1.2319984
1.2202312  1.1610713  1.2327379  1.1951761  1.8699458  0.98760885
1.6670336  1.5051543  1.2339936  1.5215651  1.534271   1.1805111
1.1587876  1.0894692  1.1936147  1.3278677  1.2409594  1.0499009... ]

And I got only these results:

fpr [0.         0.         0.         0.00204499 0.00204499 1.        ] 
tpr [0.         0.03225806 0.96774194 0.96774194 1.         1.        ] 
threshold [7.5198536 6.5198536 3.4357991 2.5991373 2.575739  0.8769072]

What is the reason for this?

Answer

This is most likely caused by the drop_intermediate parameter of roc_curve(), which defaults to True. With it enabled, suboptimal thresholds (points that would not change the shape of the plotted ROC curve) are dropped to produce a lighter curve; see the roc_curve documentation. To keep every threshold, pass drop_intermediate=False.

Here's an example:

import numpy as np

try:
    from sklearn.datasets import fetch_openml
    # as_frame=False (scikit-learn >= 0.22) returns NumPy arrays,
    # so the integer-array indexing below works
    mnist = fetch_openml('mnist_784', version=1, cache=True, as_frame=False)
    mnist["target"] = mnist["target"].astype(np.int8)
except ImportError:
    # fallback for old scikit-learn versions that lack fetch_openml
    from sklearn.datasets import fetch_mldata
    mnist = fetch_mldata('MNIST original')

from sklearn.linear_model import SGDClassifier
from sklearn.model_selection import cross_val_predict

X, y = mnist["data"], mnist["target"]
X_train, X_test, y_train, y_test = X[:60000], X[60000:], y[:60000], y[60000:]
shuffle_index = np.random.permutation(60000)
X_train, y_train = X_train[shuffle_index], y_train[shuffle_index]

# binary target: is the digit a 5?
y_train_5 = (y_train == 5)
y_test_5 = (y_test == 5)

sdg_clf = SGDClassifier(random_state=42, verbose=0)
sdg_clf.fit(X_train, y_train_5)

y_scores = cross_val_predict(sdg_clf, X_train, y_train_5, cv=3, method='decision_function')

# ROC curves
from sklearn.metrics import roc_curve

fpr, tpr, thresholds = roc_curve(y_train_5, y_scores)
len(thresholds), len(fpr), len(tpr)
# (3472, 3472, 3472)

# Unlike precision/recall curves, the lengths of the ROC outputs depend on
# the drop_intermediate option, which drops suboptimal thresholds.
fpr_, tpr_, thrs = roc_curve(y_train_5, y_scores, drop_intermediate=False)
len(fpr_), len(tpr_), len(thrs)
# (60001, 60001, 60001)
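
For a quicker check that does not require downloading MNIST, here is a minimal sketch on synthetic data. The scores are an assumption that only loosely mirrors the question (520 samples, with the two classes well separated); they are not the asker's actual data. It suggests that well-separated scores leave very few "corner" points on the ROC curve, and that dropping the intermediate points does not change the area under it.

```python
import numpy as np
from sklearn.metrics import roc_curve, auc

rng = np.random.default_rng(0)

# Hypothetical data: 500 negatives scoring around 1.3, 20 positives around 5.0
y_true = np.concatenate([np.zeros(500, dtype=int), np.ones(20, dtype=int)])
y_score = np.concatenate([rng.normal(1.3, 0.2, 500), rng.normal(5.0, 0.6, 20)])

# Default behaviour: collinear (suboptimal) points are dropped
fpr, tpr, thr = roc_curve(y_true, y_score)

# Keep one point per distinct score, plus the initial (0, 0) point
fpr_all, tpr_all, thr_all = roc_curve(y_true, y_score, drop_intermediate=False)

print(len(thr), len(thr_all))
print(auc(fpr, tpr), auc(fpr_all, tpr_all))
```

The dropped points lie on straight segments between the kept ones, so plots and AUC values computed from the shorter arrays should match those from the full arrays.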