Access train and evaluation error in xgboost

2024/10/2 10:27:40

I started using the Python xgboost package. Is there a way to get the training and validation errors at each training epoch? I can't find one in the documentation.

I have trained a simple model and got this output:

    [09:17:37] src/tree/updater_prune.cc:74: tree pruning end, 1 roots, 124 extra nodes, 0 pruned nodes, max_depth=6
    [0] eval-rmse:0.407474 train-rmse:0.346349
    [09:17:37] src/tree/updater_prune.cc:74: tree pruning end, 1 roots, 116 extra nodes, 0 pruned nodes, max_depth=6
    [1] eval-rmse:0.410902 train-rmse:0.339925
    [09:17:38] src/tree/updater_prune.cc:74: tree pruning end, 1 roots, 124 extra nodes, 0 pruned nodes, max_depth=6
    [2] eval-rmse:0.413563 train-rmse:0.335941
    [09:17:38] src/tree/updater_prune.cc:74: tree pruning end, 1 roots, 126 extra nodes, 0 pruned nodes, max_depth=6
    [3] eval-rmse:0.418412 train-rmse:0.333071
    [09:17:38] src/tree/updater_prune.cc:74: tree pruning end, 1 roots, 114 extra nodes, 0 pruned nodes, max_depth=6

However, I need to pass these eval-rmse and train-rmse values further along in my code, or at least plot the curves.

Answer

One way to save the intermediate results is to pass the evals_result argument to the xgb.train method.

Let's say you have created a train and an eval matrix in XGB format (DMatrix), and have initialized some parameters params for XGBoost (in my case, params = {'max_depth':2, 'eta':1, 'silent':1, 'objective':'binary:logistic' }).

  1. Create an empty dict

    progress = dict()

  2. Create a watchlist (I guess you already have one, given that you are printing train-rmse). The second element of each pair is just a label for that dataset.

    watchlist = [(train,'train-rmse'), (eval, 'eval-rmse')]

  3. Pass these to xgb.train

    bst = xgb.train(params, train, 10, watchlist, evals_result=progress)

When training finishes, the progress dictionary will contain the train/validation errors for every boosting round, keyed first by watchlist label and then by metric name:

    >>> print(progress)
    {'train-rmse': {'error': ['0.50000', ....]}, 'eval-rmse': { 'error': ['0.5000',....]}}
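To pass the curves further along in code, the nested dictionary can be flattened into one list per curve. The helper below is a minimal sketch, not part of xgboost: flatten_evals_result is a hypothetical name, and the sample values are copied from the log in the question, assuming watchlist labels 'train' and 'eval'. Because the printout above shows the values as strings, the helper casts everything to float.

```python
def flatten_evals_result(evals_result):
    """Turn {'label': {'metric': [v0, v1, ...]}} into {'label-metric': [floats]}."""
    curves = {}
    for label, metrics in evals_result.items():
        for metric, values in metrics.items():
            # Values may arrive as strings, so cast defensively.
            curves[f"{label}-{metric}"] = [float(v) for v in values]
    return curves


# Sample data taken from the question's log (layout is an assumption).
progress = {
    "train": {"rmse": [0.346349, 0.339925, 0.335941, 0.333071]},
    "eval": {"rmse": ["0.407474", "0.410902", "0.413563", "0.418412"]},
}

curves = flatten_evals_result(progress)

# Index of the round with the lowest validation error.
eval_curve = curves["eval-rmse"]
best_round = min(range(len(eval_curve)), key=eval_curve.__getitem__)
print(best_round, eval_curve[best_round])
```

Each entry in curves is a plain list indexed by boosting round, so it can be handed directly to a plotting library or written to a log file.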