User 2568 | 1/10/2016, 10:13:25 PM
I was viewing this presentation on GBT to get some ideas on how to improve my model accuracy and reduce overfitting.
I want to produce a chart like this. That is, I want to plot the model's training and validation accuracy by iteration.
The necessary data is calculated at each iteration and, with verbose=True, printed out. Is it accessible after the model has been created? Scikit-learn looks like it returns this in the model.
What I'd like to be able to write is something like this:
    import graphlab as gl
    import matplotlib.pyplot as plt

    train, validate = data.random_split(0.8, seed=8754)
    model = gl.boosted_trees_classifier.create(train, target='label', validation_set=validate)
    score = model['score']  # Proposed: get an array of 'iteration', 'training', 'validation'
    plt.plot(score['iteration'], score['training'], score['iteration'], score['validation'])
From this I can see the optimal iteration and get a better idea of overfitting, etc. If this were returned as part of the model, the data would be available at any time, which is especially useful when doing hyper-parameter searches.
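For comparison, here is a rough sketch of what I mean on the scikit-learn side. As far as I can tell, GradientBoostingClassifier keeps the per-iteration training loss in train_score_, and staged_predict lets you recover accuracy at each iteration after fitting. The synthetic data, the 50 iterations, and the variable names are only assumptions for illustration, not my actual setup.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

# Synthetic data stands in for the real dataset (assumption for illustration).
X, y = make_classification(n_samples=500, n_features=10, random_state=8754)
X_train, X_valid, y_train, y_valid = train_test_split(
    X, y, train_size=0.8, random_state=8754)

model = GradientBoostingClassifier(n_estimators=50, random_state=8754)
model.fit(X_train, y_train)

# staged_predict yields the predictions after each boosting iteration,
# so per-iteration accuracy can be computed once training has finished.
train_acc = [accuracy_score(y_train, p) for p in model.staged_predict(X_train)]
valid_acc = [accuracy_score(y_valid, p) for p in model.staged_predict(X_valid)]
iterations = np.arange(1, len(train_acc) + 1)

# The iteration with the highest validation accuracy (first one on ties).
best_iteration = int(iterations[np.argmax(valid_acc)])
print("best iteration by validation accuracy:", best_iteration)
```

With these arrays, the chart is just plt.plot(iterations, train_acc, iterations, valid_acc). Having the equivalent stored on the GraphLab model would avoid re-scoring every iteration by hand.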