Variation of the target metric from grid search

User 5179 | 5/12/2016, 7:57:22 PM

How can I get the variation of the target metric from random_search? I only get the mean:

+-----------------+----------------------+----------------+
| fold_id         | model_id             | mean_f1_score  |
+-----------------+----------------------+----------------+
| [4, 3, 2, 1, 0] | [9, 8, 7, 6, 5]      | 0.591937280424 |
| [3, 2, 4, 0, 1] | [18, 17, 19, 15, 16] | 0.59480439625  |
| [1, 0, 3, 2, 4] | [1, 0, 3, 2, 4]      | 0.596459287015 |
| [1, 0, 3, 2, 4] | [11, 10, 13, 12, 14] | 0.595148523283 |
| [0, 2, 1, 3, 4] | [30, 32, 31, 33, 34] | 0.594564453851 |
| [2, 4, 1, 0, 3] | [27, 29, 26, 25, 28] | 0.595919681229 |
+-----------------+----------------------+----------------+

Is there a way to add the variation of the metric across folds? Also, how can I get the best model, not just the best parameters?

Comments

User 19 | 5/16/2016, 8:00:55 PM

Hi Demir,

In order to get an alternative summary of the model parameter search, you will need to combine the raw results from each job that was launched. Here's an example for you to try:

```python
import graphlab as gl

# Load the iris data and name the label column
url = 'https://archive.ics.uci.edu/ml/machine-learning-databases/iris/iris.data'
data = gl.SFrame.read_csv(url, header=False)
data.rename({'X5': 'target'})
folds = gl.cross_validation.KFold(data, 5)
(train, valid) = data.random_split(.8)

# Run the parameter search over the cross-validation folds
params = {'target': 'target'}
job = gl.model_parameter_search.create(folds,
                                       gl.boosted_trees_classifier.create,
                                       params)

# Concatenate raw results from every job
sf = gl.SFrame()
for j in job.jobs:
    s = j.get_results()['summary']
    sf = sf.append(s)

# Do your own summary across results
sf = sf.unpack('metric').unpack('metadata').unpack('parameters')
context = [c for c in sf.column_names() if c.startswith('parameters')]
summary = sf.groupby(context,
                     {'mean_acc': gl.aggregate.MEAN('metric.validation_accuracy'),
                      'var_acc': gl.aggregate.VAR('metric.validation_accuracy'),
                      'folds': gl.aggregate.CONCAT('metadata.fold_id')})
print(summary)
```

Every model created during the search is available via job.get_models(), and the parameters that correspond to each model are available in the metadata column of the sf variable above. However, remember that each of these models was trained on a single fold. Depending on your application, it may be advisable to retrain a model on your full training data using the best parameters found by the model parameter search.
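
For readers without GraphLab at hand, the same idea can be sketched in plain Python: group the per-fold scores by parameter set, compute the mean and variance of each group, and pick the parameters with the best mean. Everything here is illustrative. The rows, the `max_depth` parameter, and the `summarize` helper are hypothetical and not part of the GraphLab API, and the use of population variance is an arbitrary choice for this sketch.

```python
from collections import defaultdict
from statistics import mean, pvariance

# Hypothetical per-fold results: one row per (parameter set, fold).
rows = [
    {"max_depth": 4, "fold_id": 0, "validation_accuracy": 0.90},
    {"max_depth": 4, "fold_id": 1, "validation_accuracy": 0.94},
    {"max_depth": 6, "fold_id": 0, "validation_accuracy": 0.95},
    {"max_depth": 6, "fold_id": 1, "validation_accuracy": 0.85},
]


def summarize(rows, param_keys, metric):
    """Group rows by parameter values and summarize the metric per group."""
    groups = defaultdict(list)
    for row in rows:
        key = tuple(row[k] for k in param_keys)
        groups[key].append(row)

    summary = []
    for key, members in groups.items():
        scores = [m[metric] for m in members]
        summary.append({
            "parameters": dict(zip(param_keys, key)),
            "mean": mean(scores),
            "var": pvariance(scores),  # population variance across folds
            "folds": [m["fold_id"] for m in members],
        })
    return summary


summary = summarize(rows, ["max_depth"], "validation_accuracy")

# Select the parameter set with the highest mean score across folds.
best = max(summary, key=lambda s: s["mean"])
best_params = best["parameters"]
```

In this made-up data the two parameter sets have similar means but quite different spreads, which is the kind of difference the mean alone hides. The resulting `best_params` could then be passed to the model's create function to retrain on all of the training data.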

Let us know if that helps! Chris


User 5179 | 5/17/2016, 10:50:24 AM

@ChrisDuBois Very nice! Thanks!