popularity_recommender methodology

User 1131 | 12/29/2014, 5:39:53 PM

Hi, would it be possible to have an extended version of the documentation for this function (popularity_recommender.create())? I mean, what is the algorithm behind it, and what are the effects of adding user or item information? I see some Tree level information in the log. Does this mean a decision tree is used for predictions? Thank you very much. Pierre

Comments

User 6 | 12/29/2014, 5:41:49 PM

Hi, the documentation is here: http://graphlab.com/products/create/docs/generated/graphlab.recommender.popularity_recommender.PopularityRecommender.html The popularity-based recommender simply recommends the most popular items to all users. It can be used as a baseline that the other methods should improve on.


User 18 | 12/29/2014, 8:15:42 PM

Hi,

The popularity model does not use user or item side information when making recommendations. It simply ranks the items by the number of users who like them (or, if the training data contains a column of target scores, it uses the average score of the item). The most popular items are assumed to be the most preferred, for all users. This is a very simplistic model. We recommend using it as a baseline, or as the background model for new users (to be combined with more sophisticated models for users with more observations in the training data).
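The ranking rule described above can be sketched in a few lines of plain Python. This is only an illustration of the idea, not the GraphLab Create implementation: the function names `popularity_scores` and `recommend`, the tuple layout of the observations, and the alphabetical tie-breaking are all assumptions made for the example.

```python
from collections import defaultdict


def popularity_scores(observations, use_target=False):
    """Score each item by popularity (hypothetical helper, not a GraphLab API).

    observations: iterable of (user, item) or (user, item, score) tuples.
    Without a target, the score is the number of distinct users per item.
    With a target, the score is the item's mean target value.
    """
    users_per_item = defaultdict(set)
    score_sum = defaultdict(float)
    score_count = defaultdict(int)
    for obs in observations:
        user, item = obs[0], obs[1]
        users_per_item[item].add(user)
        if use_target:
            score_sum[item] += obs[2]
            score_count[item] += 1
    if use_target:
        return {item: score_sum[item] / score_count[item] for item in score_sum}
    return {item: float(len(users)) for item, users in users_per_item.items()}


def recommend(observations, k=3, use_target=False):
    """Return the same top-k items for every user, most popular first."""
    scores = popularity_scores(observations, use_target)
    # Ties are broken alphabetically here; that choice is an assumption.
    return sorted(scores, key=lambda item: (-scores[item], item))[:k]


likes = [("u1", "a"), ("u2", "a"), ("u3", "b"),
         ("u1", "c"), ("u2", "c"), ("u3", "c")]
ratings = [("u1", "a", 5.0), ("u2", "a", 3.0), ("u3", "b", 4.5)]

top_by_count = recommend(likes, k=2)
top_by_mean = recommend(ratings, k=2, use_target=True)
```

Note that no user or item side information enters the computation, and every user receives the same list.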

The Tree Level information you see at the end is the output of the nearest neighbor model that is constructed to answer get_similar_items() queries. I'll make a note to output more information in the log so that this is not so confusing. Thanks for bringing this up.

Alice


User 1131 | 12/31/2014, 9:00:47 AM

Thank you for your quick answer.

I was asking about user and item data because the doc for item-item collaborative filtering says "(NB: This argument is currently ignored by this model.)", but the doc for this model does not. Also, I do get different results from "gl.evaluation.rmse(test['score'], m.predict(test))" when I add item and user data to the model, so it must change something somewhere.

I have one more question: are you considering adding other popularity recommenders, for example ones based on a Wilson estimate, a Bayesian estimate, the Reddit system, etc.? That could be nice for users with no purchased products, movie ratings, etc.
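As an illustration of the first suggestion, here is a minimal sketch of the Wilson score lower bound, which ranks items by a conservative estimate of their "like" rate rather than by raw counts. It is not part of GraphLab Create; the function name `wilson_lower_bound`, the positive/total inputs, and the default z = 1.96 (roughly a 95% interval) are assumptions made for the example.

```python
import math


def wilson_lower_bound(positive, total, z=1.96):
    """Lower bound of the Wilson score interval for a proportion.

    positive: number of positive observations (e.g. likes) for an item.
    total: number of all observations for that item.
    z: normal quantile; 1.96 corresponds to roughly 95% confidence.
    """
    if total == 0:
        return 0.0  # no evidence at all: rank the item last
    p = float(positive) / total
    z2 = z * z
    denominator = 1.0 + z2 / total
    centre = p + z2 / (2.0 * total)
    margin = z * math.sqrt((p * (1.0 - p) + z2 / (4.0 * total)) / total)
    return (centre - margin) / denominator


# An item with 90 likes out of 100 outranks an item with 1 like out of 1,
# even though the second has a higher raw like rate.
well_supported = wilson_lower_bound(90, 100)
barely_rated = wilson_lower_bound(1, 1)
```

The effect is that items with very few observations are pulled down, which a plain average score does not do.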

Happy new year


User 89 | 1/1/2015, 7:23:42 AM

Hey PierreG,

The user and item side information should be ignored in the popularity model. However, there was a bug in version 1.2 that could cause the behavior you're seeing, and we've since released a bugfix version, 1.2.1, that fixes it. If you are using 1.2, hopefully this update will help -- just do "pip install --upgrade graphlab-create".

As for the documentation of the popularity recommender, not explicitly stating that side data is ignored is an oversight on our part -- thank you for pointing it out, and we'll clarify that in the documentation.

Finally, thank you for your thoughts about further enhancements to the recommender. We'll consider those. Right now, for users with no previous ratings, the other models do recommend something very similar to what the popularity based recommender would, so it should handle those cases seamlessly. However, we'll definitely consider your suggestions as additional enhancements.

Thanks, and happy new year as well!