XGBoost CV and best iteration

2024/9/25 0:38:11

I am using XGBoost's cv function to find the optimal number of rounds for my model. I would be very grateful if someone could confirm (or refute) that the optimal number of rounds is:

    estop = 40
    res = xgb.cv(params, dvisibletrain, num_boost_round=1000000000, nfold=5, early_stopping_rounds=estop, seed=SEED, stratified=True)
    best_nrounds = res.shape[0] - estop
    best_nrounds = int(best_nrounds / 0.8)

That is, the total number of rounds completed is res.shape[0], so to get the optimal number of rounds we subtract the number of early-stopping rounds.

Then, we scale up the number of rounds, based on the fraction used for validation. Is that correct?
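
One way to check the first step without relying on how many rows the result contains is to read the best round directly from the per-round validation scores. The sketch below is only an illustration: best_round_from_history is a hypothetical helper, and the list of scores stands in for a mean test-metric column of the cv result (something like res['test-logloss-mean'], where the exact column name depends on the metric and is an assumption here). It is worth running this against your installed version, because recent xgboost releases appear to return a result already truncated at the best iteration, in which case subtracting estop would undercount.

```python
def best_round_from_history(test_scores, minimize=True):
    """Return the 1-based number of rounds with the best validation score.

    test_scores: per-round mean validation metric, one value per round.
    minimize: True for losses/errors, False for metrics such as AUC.
    Hypothetical helper, not part of xgboost.
    """
    scores = list(test_scores)
    if not scores:
        raise ValueError("test_scores must not be empty")
    pick = min if minimize else max
    # Index of the first best score; +1 converts it to a round count.
    best_index = pick(range(len(scores)), key=lambda i: scores[i])
    return best_index + 1


# Made-up history: the score improves until round 4, then degrades.
history = [0.70, 0.55, 0.48, 0.45, 0.46, 0.47, 0.49]
best_nrounds = best_round_from_history(history)
print(best_nrounds)
```

If this value equals res.shape[0], the result was already cut at the best round and no subtraction is needed; if it equals res.shape[0] - estop, the subtraction in the snippet above is the right adjustment.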


Yes, that sounds correct, provided that best_nrounds = int(best_nrounds / 0.8) is meant to account for the validation set being 20% of your whole training data in each fold (which is what 5-fold cross-validation implies).

The rule can then be generalized as:

    n_folds = 5
    best_nrounds = int((res.shape[0] - estop) / (1 - 1 / n_folds))

Or if you don't perform CV but a single validation:

    validation_slice = 0.2
    best_nrounds = int((res.shape[0] - estop) / (1 - validation_slice))
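
Both variants of the rule can be wrapped in one small function. This is a minimal sketch, not an xgboost API: scale_rounds and its parameter names are hypothetical, and cv_best_rounds is assumed to be the best round count already found during validation.

```python
def scale_rounds(cv_best_rounds, n_folds=None, validation_slice=None):
    """Scale a round count found on a training subset up to the full data.

    Pass exactly one of:
      n_folds          - number of CV folds (held-out fraction is 1 / n_folds)
      validation_slice - held-out fraction for a single validation split
    Hypothetical helper illustrating the rule above.
    """
    if (n_folds is None) == (validation_slice is None):
        raise ValueError("pass exactly one of n_folds or validation_slice")
    if n_folds is not None:
        if n_folds < 2:
            raise ValueError("n_folds must be at least 2")
        # Same as rounds / (1 - 1 / n_folds), written with integers
        # so the result does not depend on floating-point rounding.
        return cv_best_rounds * n_folds // (n_folds - 1)
    if not 0 < validation_slice < 1:
        raise ValueError("validation_slice must be between 0 and 1")
    return int(cv_best_rounds / (1 - validation_slice))


# 5-fold CV: each model was trained on 80% of the data.
print(scale_rounds(80, n_folds=5))
# Single split with a quarter of the data held out.
print(scale_rounds(75, validation_slice=0.25))
```

The scaling itself is a heuristic: it assumes the useful number of boosting rounds grows in proportion to the amount of training data, which is worth validating on your own problem.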

You can see an example of this rule being applied here on Kaggle (see the comments).

