Training the implicit ALS solver in the Ranking Factorization Recommender on the million song dataset

User 1314 | 3/2/2015, 8:00:52 PM

Hi,

I'm following <a href="http://dato.com/learn/gallery/notebooks/recsysrank10K_song.html">Chris's million song tutorial</a>. When I use the same training data to train a ranking factorization recommender with the iALS solver, I'm not seeing RMSE reported during the training phase. When I call the evaluate function to get RMSE and precision, RMSE is > 1, which concerns me.

A few questions:

1) Could you check whether I'm using the function call correctly?

2) Shouldn't RMSE be <= 1 for this implicit collaborative filtering approach?

Version: GraphLab Create 1.3

Code: http://nbviewer.ipython.org/gist/jinsu/d31075fb4ccf2972447a
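
For reference, the relevant calls look roughly like this (a minimal sketch; the file path is hypothetical and the column names follow the tutorial's dataset):

```python
import graphlab as gl

# Triplets of (user_id, song_id, listen_count); path is hypothetical
sf = gl.SFrame.read_csv('10000_songs.csv')

# Hold out some items per user for evaluation
train, test = gl.recommender.util.random_split_by_user(
    sf, user_id='user_id', item_id='song_id')

# Ranking factorization recommender with the implicit ALS solver
m = gl.recommender.ranking_factorization_recommender.create(
    train, user_id='user_id', item_id='song_id',
    target='listen_count', solver='ials')

# With a target set, evaluate reports RMSE as well as precision/recall
results = m.evaluate(test)
```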

Comments

User 19 | 3/2/2015, 9:02:25 PM

Hi,

If your goal is to predict some target value, you should use factorization_recommender rather than ranking_factorization_recommender. With that change I get an RMSE of about 2.22 in your example.
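
A minimal sketch of that change (assuming the same train/test split and column names as the tutorial):

```python
import graphlab as gl

# Regression-style model: predicts the listen_count target directly
m = gl.recommender.factorization_recommender.create(
    train, user_id='user_id', item_id='song_id', target='listen_count')

# RMSE of the predicted listen counts on held-out data
rmse = m.evaluate_rmse(test, target='listen_count')['rmse_overall']
```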

If you'd like to read more about this, check out the "Choosing a model" subsection in the Recommender section of the <a href="http://dato.com/learn/userguide/index.html">userguide</a>.

Chris


User 1314 | 3/2/2015, 9:46:26 PM

Chris, Thanks for the fast reply. My goal is to predict rankings, i.e., to get song recommendations for a user based on the songs the user has listened to, not to predict a listen_count or other target value. Is ranking_factorization_recommender still the model I should use?


User 19 | 3/2/2015, 10:15:50 PM

Thanks for the clarification.

In that case, you should use ranking_factorization_recommender as you had done originally, but RMSE is not necessarily a reasonable metric for evaluation. RMSE measures how closely the model predicts the listen_count, rather than how well your model ranks items for each user. Try using m.evaluate(test) and inspecting the precision/recall scores for your model, as sketched below.
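
For example, something along these lines (the cutoffs are illustrative):

```python
# m.evaluate(test) prints precision/recall at several cutoffs;
# evaluate_precision_recall returns the same numbers as SFrames
pr = m.evaluate_precision_recall(test, cutoffs=[5, 10, 20])

# Columns: cutoff, precision, recall
overall = pr['precision_recall_overall']
```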

For more on this, you might be interested in the "Evaluating Model Performance" section of the userguide.

If you have any other questions, or run into any issues, please let us know!


User 1314 | 3/3/2015, 7:42:14 PM

I understand that RMSE isn't the best metric, but I'm looking for a reasonable value to use for tuning the model parameters. That's been the case in all of the GraphLab examples I've run into.

Are you suggesting that I tune the parameters based on a custom evaluation function that combines the precision and recall values? I'd really appreciate finding out how the iALS solver was designed to be tuned.


User 19 | 3/4/2015, 2:02:13 AM

Yes, if you are using model_parameter_search you should consider writing a custom evaluation function that uses the results from m.evaluate(test). For example, you might be interested in optimizing precision at 5.
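
A sketch of such a custom evaluation function (the exact signature model_parameter_search expects can differ across GraphLab Create versions, so treat this as illustrative):

```python
def precision_at_5(model, train, test):
    # Score the model by its overall precision at cutoff 5 on held-out data
    pr = model.evaluate_precision_recall(test, cutoffs=[5])
    return {'precision@5': pr['precision_recall_overall']['precision'][0]}
```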