Randomized stratified k-fold cross-validation in scikit-learn?

2024/9/21 10:44:04

Is there any built-in way to get scikit-learn to perform shuffled stratified k-fold cross-validation? This is one of the most common CV methods, and I am surprised I couldn't find a built-in method to do this.

I saw that cross_validation.KFold() has a shuffling flag, but it is not stratified. Unfortunately cross_validation.StratifiedKFold() does not have such an option, and cross_validation.StratifiedShuffleSplit() does not produce disjoint folds.

Am I missing something? Is this planned?

(obviously I can implement this by myself)

Answer

A shuffle flag for cross_validation.StratifiedKFold was introduced in version 0.15, the current release:

http://scikit-learn.org/0.15/modules/generated/sklearn.cross_validation.StratifiedKFold.html

This can be found in the Changelog:

http://scikit-learn.org/stable/whats_new.html#new-features

Shuffle option for cross_validation.StratifiedKFold. By Jeffrey Blackburne.
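
As a rough illustration of the shuffle option, here is a minimal sketch. It assumes a more recent scikit-learn release than the 0.15 discussed above: in later versions the class lives in sklearn.model_selection and takes n_splits, with the labels passed to split() instead of the constructor. The toy data, fold count, and random_state value are made up for the example.

```python
import numpy as np
from sklearn.model_selection import StratifiedKFold  # assumption: scikit-learn >= 0.18

# Toy data (hypothetical): 18 samples, 2 features, labels imbalanced 2:1.
X = np.arange(36).reshape(18, 2)
y = np.array([0] * 12 + [1] * 6)

# shuffle=True randomizes which samples of each class land in which fold;
# random_state makes the shuffle reproducible.
skf = StratifiedKFold(n_splits=3, shuffle=True, random_state=0)

# Collect the test indices of each fold.
folds = [test_idx for _, test_idx in skf.split(X, y)]

for i, test_idx in enumerate(folds):
    # Print the fold number, its test indices, and the per-class counts.
    print(i, sorted(test_idx.tolist()), np.bincount(y[test_idx]).tolist())
```

Unlike StratifiedShuffleSplit, the test folds here are disjoint and together cover every sample once, while each fold keeps roughly the class ratio of the full label vector.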
