Graphlab recommends items based on popularity when using ItemSimilarityRecommender.

User 5174 | 5/5/2016, 12:05:46 PM

Hello,

I have recently been using the GLAB item similarity recommender with a "pre-computed similarity matrix between items" in order to build a pure content-based RS: https://dato.com/products/create/docs/generated/graphlab.recommender.itemsimilarityrecommender.ItemSimilarityRecommender.html

In one of the experiments I realized that when my item features are all zero (so the corresponding similarity matrix is all 1), GLAB produces very good results! It was strange and very surprising to me that GLAB can produce such good results when all the similarities are identical.

After checking the recommendations, I noticed that they are almost identical for all users. The recommended items also had a high number of ratings. This made me believe that "recommendation based on popularity of items" is automatically chosen by the algorithm in the scenario I described above. At first look this may seem good, but when evaluating the performance of my RS I need to make sure that any results obtained come solely from the content features and not from other factors. Now, I am not sure when GLab uses recommendation based on popular items (where popular items are the ones with the highest number of ratings).
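The check described above (are the recommendations just the most-rated items?) can be sketched in plain Python, independent of GraphLab. The function name `popularity_overlap` and the toy data are hypothetical, chosen only for illustration; it assumes the ratings are available as (user, item) pairs and the recommendations as a per-user list.

```python
from collections import Counter

def popularity_overlap(ratings, recommendations, k):
    """For each user, the fraction of their top-k recommendations
    that are among the k most-rated items in the training data.

    ratings: iterable of (user, item) pairs.
    recommendations: dict mapping user -> ordered list of items.
    """
    # Popularity here means "number of ratings", as in the post above.
    counts = Counter(item for _, item in ratings)
    most_rated = {item for item, _ in counts.most_common(k)}
    return {
        user: len(set(items[:k]) & most_rated) / float(k)
        for user, items in recommendations.items()
    }

# Toy data (made up): "a" has 3 ratings, "b" has 2, "c" has 1.
ratings = [("u1", "a"), ("u2", "a"), ("u3", "a"),
           ("u1", "b"), ("u2", "b"), ("u3", "c")]
recs = {"u1": ["a", "b"], "u2": ["a", "b"], "u3": ["c", "d"]}
overlap = popularity_overlap(ratings, recs, k=2)
```

An overlap close to 1.0 for most users would be consistent with a popularity-driven ranking, although it does not by itself prove which code path the library took.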

Is there any way in Glab to switch off "recommendation based on popular items" when I am using itemsimilarityrecommender? I would prefer to have NO recommendation or a random recommendation when I am using it with item similarities that are ALL equal.

It would be great if you could clarify this point.

Thanks YAS

Comments

User 1207 | 5/5/2016, 9:30:33 PM

Hello @YAS,

I'm not completely sure what's going on in your case. The popularity model is chosen if that user has rated no items, and there currently isn't a way to turn this off -- however, the popularity model just scores the recommended results by the average score, so if your item ratings are all zero, then it would still predict a score of zero. Regardless, you should be able to avoid this by only asking for recommendations for users that were already in the model. However, I may be missing something about what you are trying to do -- does that answer your question?
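The suggestion to ask only for users that were already in the model could be sketched as a simple filter, shown here in plain Python. The helper name `users_seen_in_training` and the sample data are hypothetical; the sketch assumes the training observations are available as (user, item) pairs, and that the filtered list is then what gets passed when requesting recommendations.

```python
def users_seen_in_training(train_ratings, candidate_users):
    """Keep only the candidate users that have at least one
    observation in the training data, preserving their order."""
    seen = {user for user, _ in train_ratings}
    return [u for u in candidate_users if u in seen]

# Toy data (made up): "u9" never appears in training, so it is dropped
# and would not trigger a cold-start fallback.
train = [("u1", "a"), ("u2", "b")]
eval_users = users_seen_in_training(train, ["u1", "u9", "u2"])
```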

Thanks! -- Hoyt


User 5174 | 5/6/2016, 3:11:03 PM

Dear hoytak,

Thanks for your explanation and for clarifying when and how the popularity recommender can be chosen.

In my case, I am building a Pure Content-Based recommender system using itemsimilarityrecommender. The dataset is split by ratings. Please note that the similarities are based solely on item features (and not on ratings); they are computed offline and fed into itemsimilarityrecommender.

Now, by coincidence, it sometimes happens that the similarity matrix provided to itemsimilarityrecommender is equal to "1" between all items. In such a case, GLab effectively has an arbitrary choice to make for the recommendation (basically all items are equal). Because we are doing research on the model, we do NOT want a recommendation based on popularity. If, for a user, all items are equally good candidates, we prefer NO recommendation or a random recommendation.

When the model switches to popularity recommendation, we end up with extremely GOOD results (results based on popularity) for extremely BAD features! This is NOT what we want. In your view, what mechanisms can we adopt to avoid this?

To put it in a nutshell: "How is it possible to build a pure CB recommender system in GLAB with precomputed similarities between items such that, if the similarities are all equal, we receive a random recommendation (not a popularity-based one)?"
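For reference, the behaviour asked for here can be sketched outside of GraphLab in plain Python. This is not GraphLab's algorithm, only a minimal illustration under stated assumptions: each unseen item is scored by its summed precomputed similarity to the items the user has rated, and ties are broken uniformly at random. The function name `recommend_content_based` and the nested-dict layout of `similarity` are hypothetical.

```python
import random

def recommend_content_based(user_items, similarity, all_items, k, rng):
    """Rank unseen items by summed similarity to the user's items.
    Items with equal scores end up in random order."""
    seen = set(user_items)
    candidates = [i for i in all_items if i not in seen]
    scores = {c: sum(similarity[c][j] for j in user_items)
              for c in candidates}
    # Shuffle first, then sort: Python's sort is stable, so items
    # with equal scores keep their shuffled (random) relative order.
    rng.shuffle(candidates)
    candidates.sort(key=lambda c: scores[c], reverse=True)
    return candidates[:k]

items = ["a", "b", "c", "d"]

# Degenerate case from the question: every similarity equals 1.
all_ones = {i: {j: 1.0 for j in items} for i in items}
random_recs = recommend_content_based(["a"], all_ones, items, 2,
                                      random.Random(0))

# Informative similarities (made-up values) give a fixed ranking.
informative = {"b": {"a": 0.9}, "c": {"a": 0.1}, "d": {"a": 0.5}}
ranked = recommend_content_based(["a"], informative, items, 3,
                                 random.Random(0))
```

With an all-ones matrix every candidate gets the same score, so the output depends only on the random generator and carries no popularity signal. Returning an empty list when all scores are equal would be the "NO recommendation" variant.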

Your clarification will be of invaluable help and is appreciated in advance.

My Best YAS