Any way to peek at or load a trained model during model_param_search?

User 1375 | 3/27/2015, 2:22:15 AM

Is there currently a way to load, say, the best-performing model during a model_parameter_search run? I'm currently using environment.LocalAsync, and tried the following, which didn't work.

This is the summary of the job

<!-- HTML generated using hilite.me --><div style="background: #ffffff; overflow:auto;width:auto;border:solid gray;border-width:.1em .1em .1em .8em;padding:.2em .6em;"><pre style="margin: 0; line-height: 125%">Info

Job : Model-Parameter-Search-Mar-26-2015-06-56-37 Environment : LocalAsync: [name: local-async-param-search, params: {}] Function(s) : train_test_model-0-0, train_test_model-1-0, train_test_model-2-0 ... (total 136 functions). Status : Running

Help

Visualize progress : self.show() Query status : self.get_status() Get results : self.get_results()

Metrics

None

Execution Information

Process pid : 7907 Execution Directory : /tmp/task-exec-ydJheH Log file : /tmp/task-exec-ydJheH/execution.log </pre></div>

Here's what I tried

<!-- HTML generated using hilite.me --><div style="background: #ffffff; overflow:auto;width:auto;border:solid gray;border-width:.1em .1em .1em .8em;padding:.2em .6em;"><pre style="margin: 0; line-height: 125%">best_so_far_gbt = gl.load_model(&#39;/tmp/task-exec-ydJheH/train_test_model-79-0-1427352997.93.gl&#39;) </pre></div>

...and the error I received.

<!-- HTML generated using hilite.me --><div style="background: #ffffff; overflow:auto;width:auto;border:solid gray;border-width:.1em .1em .1em .8em;padding:.2em .6em;"><pre style="margin: 0; line-height: 125%">--------------------------------------------------------------------------- IOError Traceback (most recent call last) &lt;ipython-input-60-5951b6936d68&gt; in &lt;module&gt;() ----&gt; 1 best_so_far_gbt = gl.load_model(&#39;/tmp/task-exec-ydJheH/train_test_model-79-0-1427352997.93.gl&#39;)

/root/anaconda/envs/graphlab/lib/python2.7/site-packages/graphlab/toolkits/model.pyc in load_model(location) 51 _mt._get_metric_tracker().track(&#39;toolkit.model.load_model&#39;) 52 ---&gt; 53 return glconnect.get_unity().load_model(_make_internal_url(location)) 54 55 def _get_default_options_wrapper(unity_server_model_name,

cyunity.pyx in graphlab.cython.cyunity.UnityGlobalProxy.load_model()

cyunity.pyx in graphlab.cython.cyunity.UnityGlobalProxy.load_model()

IOError: Unable to load model from /tmp/task-exec-ydJheH/train_test_model-79-0-1427352997.93.gl: Cannot open /tmp/task-exec-ydJheH/train_test_model-79-0-1427352997.93.gl/dir_archive.ini for read. Cannot open /tmp/task-exec-ydJheH/train_test_model-79-0-1427352997.93.gl/dir_archive.ini for reading </pre></div>

Comments

User 1178 | 3/27/2015, 5:25:53 PM

Hi Marcos,

job.get_results() returns a dictionary from which you can get the models:

	>>> job_result = job.get_results()
	>>> models = job_result['models']
	>>> summary_sframe = job_result['summary']

You can then pick whichever models you want.
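For illustration, here is a minimal plain-Python sketch (not the actual GraphLab API) of picking the best model out of a results dictionary shaped like the one `job.get_results()` returns. The `'accuracy'` metric name and the contents are made up for the example; in practice you would inspect the columns of the `summary` SFrame.

```python
# Hypothetical results dictionary shaped like job.get_results():
# 'models' is a list of trained models, 'summary' a parallel table of metrics.
results = {
    "models": ["model_a", "model_b", "model_c"],
    "summary": [
        {"params": {"max_depth": 4}, "accuracy": 0.81},
        {"params": {"max_depth": 6}, "accuracy": 0.87},
        {"params": {"max_depth": 8}, "accuracy": 0.84},
    ],
}

# Find the index of the summary row with the highest accuracy,
# then use it to select the corresponding model.
best_idx = max(range(len(results["summary"])),
               key=lambda i: results["summary"][i]["accuracy"])
best_model = results["models"][best_idx]
print(best_idx, best_model)  # -> 1 model_b
```

The key point is that `models` and the `summary` rows are index-aligned, so the argmax over a metric column directly indexes the model list.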

Here is the API documentation with more detail: https://dato.com/products/create/docs/graphlab.toolkits.model_parameter_search.html?highlight=model_parameter_search#module-graphlab.toolkits.model_parameter_search

Hope this helps!

Ping


User 1375 | 3/27/2015, 8:58:25 PM

Thank you @"Ping Wang". My question was about getting/loading models during a job execution, not after the job has completed.


User 1190 | 4/6/2015, 6:35:55 PM

Hi @msainz,

model_parameter_search is composed of a map phase and a reduce phase. The map phase returns, for each parameter set, a trained model and an SFrame storing the evaluation metrics. The reduce phase simply concatenates the results from the map phase.
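The map/reduce structure described above can be sketched in plain Python. This is purely illustrative: `train_one`, the toy "model", and the `score` metric are made-up stand-ins, not GraphLab internals.

```python
def train_one(params):
    # Map step: for one parameter set, return (model, metrics row).
    # The "model" here is just a dict echoing its params, and the
    # metric is an arbitrary deterministic function of the params.
    model = {"params": params}
    metrics = {"params": params, "score": params["max_depth"] ** 2}
    return model, metrics

# A small hypothetical parameter grid.
param_grid = [{"max_depth": d} for d in (2, 4, 6)]

# Map phase: one training run per parameter set.
mapped = [train_one(p) for p in param_grid]

# Reduce phase: simply concatenate the models and the metric rows.
models = [m for m, _ in mapped]
summary = [s for _, s in mapped]
print(len(models))  # -> 3
```

Because the reduce phase is a plain concatenation, no comparison between parameter sets happens inside the search itself, which is why the API has no built-in notion of a "best" model.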

The API itself does not have a notion of the "best" model. I imagine you would like to do some early stopping of the search by comparing candidates against the current best-performing model?

Can you elaborate on your use case for loading the best model during a model parameter search?

-jay