What metric is used to compute mean_validation_accuracy reported by model_parameter_search?

User 2568 | 3/15/2016, 5:28:08 AM

I'm using grid_search.create and/or random_search.create with 'metric':'auc', however the reported 'mean_validation_accuracy' does not appear to be AUC. To test this I set up the parameters with just one set of values, like this:

params = {'target':'TARGET', 'random_seed':8923,
          'class_weights':'auto','metric':'auc',
          'early_stopping_rounds':20, 'max_iterations':500,
          'step_size': 0.1,
          'max_depth': 5,
          'column_subsample': 0.6}

I then called grid search

job = gl.grid_search.create(folds, gl.boosted_trees_classifier.create, params)
job.get_results().sort('mean_validation_accuracy', ascending=False)

The reported 'mean_validation_accuracy' is 0.78608260984. When I run this directly, as follows:

auc = 0.0
for train, validate in folds:
    model = gl.boosted_trees_classifier.create(train, validation_set=validate,
                                               verbose=False, **params)
    fold_auc = model.evaluate(validate, metric='auc')['auc']
    print fold_auc
    auc += fold_auc

print "Average:", auc / 5.0

I get Average: 0.836612402121

I'm guessing that the reported 'mean_validation_accuracy' is not the AUC set via 'metric'.

The data and notebook are in this repo.

Comments

User 1190 | 3/16/2016, 6:08:49 AM

You are right. mean_validation_accuracy is the average of accuracy. Please refer to the API docs for details about the evaluator used in model_parameter_search: https://dato.com/products/create/docs/generated/graphlab.toolkits.modelparametersearch.create.html#graphlab.toolkits.modelparametersearch.create
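
For example, mirroring the loop from the question with the default 'accuracy' metric should reproduce roughly the reported number (a sketch, assuming the same folds, seed and params):

acc = 0.0
for train, validate in folds:
    model = gl.boosted_trees_classifier.create(train, validation_set=validate,
                                               verbose=False, **params)
    # per-fold accuracy is what gets averaged into mean_validation_accuracy
    acc += model.evaluate(validate, metric='accuracy')['accuracy']

print "Average accuracy:", acc / 5.0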


User 2568 | 3/16/2016, 8:35:59 AM

OK, got it. So I need to write the following:

def auc_eval(model, train, test): 
    return {'train_auc':    model.evaluate(train, metric='auc')['auc'],
            'validate_auc': model.evaluate(test,  metric='auc')['auc']}

params = {'target':'TARGET', 'random_seed':8923,
          'class_weights':'auto','metric':'auc', 
          'early_stopping_rounds':20, 'max_iterations':500,
          'step_size': [0.03, 0.07, 0.1],
          'max_depth': [3, 4, 7, 9],
          'column_subsample': [1, 0.8, 0.6]}

job = gl.grid_search.create(folds, gl.boosted_trees_classifier.create,
                            params, evaluator=auc_eval)
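
and then check how the keys from auc_eval surface in the results. I'd expect them to become columns of get_results(), possibly with a prefix once averaged across folds, so printing the column names first seems safer than guessing:

results = job.get_results()
print results.column_names()  # find the columns produced by auc_eval
# then sort on whichever column holds the validation AUC, e.g.
# results.sort('validate_auc', ascending=False)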

However, I'd like to try to convince you that this is perhaps not the best approach, as it is inconsistent with how boosted_trees_classifier.create works and leads to unnecessary confusion.

In params, 'metric' is set to 'auc', and this is passed to both search and create. When I run boosted_trees_classifier.create:

1. progress statements are calculated using the metric passed in params,
2. early stopping is based on the metric passed in params,
3. the returned model uses the metric passed in params.

Further, the models returned by search have used the metric set in params and not the default. The common behavior is that the 'metric' parameter setting overrides the default metric, but this is not the case in search.

My suggestion is that search be changed to align with this more common behavior.

P.S. I was also surprised that I can't pass evaluator in params.


User 1190 | 3/17/2016, 5:36:20 PM

You have a valid point, and I apologize if this API causes confusion for you. We will update the default evaluator to take the 'metric' parameter into account.

'evaluator' is not a parameter to classifier.create() but a parameter to search.create(). That's why it cannot be included in params.
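
In the meantime, one way to keep everything driven by the 'metric' entry in params is to build the evaluator from it and pass it only to search.create(). This is just a sketch generalising the auc_eval above, not the shipped default evaluator:

def make_metric_evaluator(metric):
    # Assumes model.evaluate() returns a dict keyed by the metric name,
    # as it does for 'auc' in the examples above.
    def evaluator(model, train, test):
        return {'train_' + metric:    model.evaluate(train, metric=metric)[metric],
                'validate_' + metric: model.evaluate(test,  metric=metric)[metric]}
    return evaluator

job = gl.grid_search.create(folds, gl.boosted_trees_classifier.create,
                            params,  # forwarded to boosted_trees_classifier.create()
                            evaluator=make_metric_evaluator(params['metric']))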

Thank you very much for your feedback.