API changes in GraphLab Create v1.0 toolkits

User 18 | 10/21/2014, 12:16:33 AM

GraphLab Create v1.0 includes a number of API changes across various toolkits. These changes are meant to make the toolkits easier to use. For instance, you can now call <code class="CodeInline">classifier.create</code> or <code class="CodeInline">recommender.create</code> without specifying a model, and the toolkit will automatically select a model for you. You can still create each model individually and customize it to your heart's content with things like <code class="CodeInline">svm_classifier.create</code> and <code class="CodeInline">factorization_recommender.create</code>. The dual entry points give beginning users a fast starting point and give expert users room to customize.

With that said, if you've been using previous versions of GraphLab Create, a number of changes will require corresponding changes in your existing code. Below is a comprehensive list to help ease that transition.

<b>Important model behavior changes</b> <ul> <li> The appropriate scaling of the hyperparameters for factorization_recommender and ranking_factorization_recommender has changed. <i class="Italic">Previous code that sets specific regularization values needs to change to the new scale.</i> The values are now on the scale of a single data point, rather than on the scale of the whole dataset. So if your dataset had N = 1 million rows and you found 1.0 to be a good value for the regularization argument, you may want to try 1/N = 1e-6 for this argument when using v1.0. <li> Models saved in v0.9.1 or earlier can no longer be loaded in v1. This is a result of extensive changes in the internal representation of the models. <i class="Italic">Please re-train and re-save the models in v1.</i> </ul>
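As a rough illustration of the rescaling described above, here is a minimal sketch in plain Python. The helper name `rescale_regularization` is hypothetical and is not part of GraphLab Create; it only performs the 1/N arithmetic from the example, and the result is a starting point to tune from rather than a guaranteed equivalent.

```python
def rescale_regularization(old_value, num_rows):
    """Convert a dataset-scale regularization value (pre-v1.0) into a
    per-data-point value by dividing by the number of rows N.

    Hypothetical helper for illustration only; not a GraphLab Create API.
    """
    if num_rows <= 0:
        raise ValueError("num_rows must be a positive row count")
    return old_value / float(num_rows)


# The example from the post: 1.0 worked well on a dataset with 1 million rows.
new_regularization = rescale_regularization(1.0, 1000000)
print(new_regularization)
```

The rescaled value would then be passed as the regularization argument when re-creating the model in v1.0.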

<b class="Bold">API module level changes</b> <ul><li> New structure of the recommender module: <ul><li> All recommender models now have a *_recommender suffix. <li> The Popularity and ItemMeans models are now both accessible through popularity_recommender. <li> LinearRegressionModel is no longer supported. Use factorization_recommender or ranking_factorization_recommender instead. </ul> <li> New top-level modules: <ul><li> classifier <li> regression </ul> <li> Previous classification and regression models now sit within the new classifier and regression modules and have a *_classifier or *_regression suffix. <li> New module level functionality: <ul><li> Wherever there is a create(), there is now a get_default_options(). <li> m.get_current_options() returns the option values that were used to create the model. <li> m.summary() no longer returns a dictionary. </ul> <li> graphlab.text_analytics now contains the utilities previously grouped under graphlab.text.util. <li> pandas DataFrames are no longer accepted as input. </ul>

<b class="Bold">Name changes for input options and model fields</b> <ul><li> Across toolkits, the model field name pattern 'train_abc' is replaced with 'training_abc': <ul><li> train_time -> training_time <li> train_rmse -> training_rmse <li> train_iterations -> training_iterations <li> model field 'validation_rmse' is added (where applicable) <li> runtime -> training_time in kmeans and graph analytics </ul> <li> input option name pattern 'n_abc' is changed to 'num_abc': <ul><li> recommenders: n_factors -> num_factors <li> model field and output SFrame column name change: abcid -> abc_id <li> kmeans: clusterid -> cluster_id <li> connected_components: componentid -> component_id <li> graph_coloring: colorid -> color_id <li> kcore: coreid -> core_id </ul> <li> factorization_recommender input option: binary_targets -> binary_target <li> across toolkits: <ul><li> max_iters -> max_iterations <li> num_iters or n_iters -> num_iterations <li> all models now contain 'num_examples' </ul> <li> feature counts and lists: <ul><li> m['features'] contains a </ul> </ul>

FYI: If you are using Anaconda and having problems with NumPy

Hello everyone,

I ran into an issue a few days ago and found out something that may be affecting many GraphLab users who use it with Anaconda on Windows. NumPy was unable to load, and consequently neither could anything that requires it (Matplotlib, etc.).

It turns out that the current NumPy build (1.10.4) for Windows is problematic (more info here).

Possible workarounds are downgrading to 1.10.1 or forcing an upgrade to 1.11.0, if your dependencies allow. Downgrading was easy for me with <code class="CodeInline">conda install numpy=1.10.1</code>.
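If you want to check whether an environment has the affected version before changing anything, something like the following sketch may help. It assumes the problem is limited to 1.10.4 as described above, and it reads the installed package metadata instead of importing NumPy (since the import itself is what fails). The helper names are hypothetical, and `importlib.metadata` requires Python 3.8 or later, which is an assumption about your setup.

```python
from importlib import metadata

# Version reported as problematic on Windows in this post.
AFFECTED_VERSION = "1.10.4"


def installed_numpy_version():
    """Return the installed NumPy version string without importing NumPy,
    or None if no NumPy distribution is found."""
    try:
        return metadata.version("numpy")
    except metadata.PackageNotFoundError:
        return None


def is_affected(version):
    """True only when the given version string is exactly the affected one."""
    return version is not None and version.strip() == AFFECTED_VERSION


version = installed_numpy_version()
if is_affected(version):
    print("NumPy %s is installed; consider 1.10.1 or 1.11.0." % version)
else:
    print("Installed NumPy version: %s" % version)
```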

Thanks for your attention!

Rafael
