Where to input recommender context data?

User 1025 | 1/8/2015, 10:10:37 AM

Hi, in the newest release notes (for v1.2.1, http://graphlab.com/products/create/upgrade/), I see the following:

<b class="Bold">Recommender Toolkit</b> <ul> <li>Recommend method has been improved to allow for observation data to specify context for recommendations (e.g. recommending items based on day of week).</li> </ul>

Could anyone please tell/show how to use this new feature, or where it is documented? I don't see it here: http://graphlab.com/products/create/docs/generated/graphlab.recommender.create.html#graphlab.recommender.create

Thank you!!

Comments

User 6 | 1/8/2015, 1:57:07 PM

There are three types of additional data you can feed into the recommender to improve performance: (a) user data, such as user age, zip code, and education level; (b) item data, such as category, weight, shape, and price; and (c) rating information, such as the time of the rating or the amount paid.

Please see the documentation <a href="https://dato.com/products/create/docs/generated/graphlab.recommender.rankingfactorizationrecommender.create.html#graphlab.recommender.rankingfactorizationrecommender.create">here</a>. The relevant flag is "side_data_factorization". By default it is set to True, which means any additional columns in your data SFrame are treated as side data and fed into the algorithm.


User 1025 | 1/8/2015, 5:16:04 PM

Thanks @DannyBickson. I'm still a bit confused about where, and in what structure, to supply the "(c) rating information...". In the documentation link you gave, I see parameters for user and item side info (<i class="Italic">user_data=None, item_data=None</i>), but how/where do I specify the rating time/day, etc.?

Is it just that any date or time column in the observation_data SFrame gets automatically used in this fashion?

Thanks again, and congratulations on the name change and new funding!


User 6 | 1/8/2015, 5:21:21 PM

Any additional columns in the rating line, besides user_id and item_id, are treated as side features for the rating, so you can include a date or time column there as well. (If it is a Unix timestamp, we recommend splitting it into numeric day, month, year, hour, minute, and second columns, so the model can find correlations between items that are bought in the same hour, month, etc.)
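A minimal sketch of that split, using only the Python 3 standard library rather than any GraphLab Create call. The helper name `split_timestamp`, the sample rows, and the output column names are assumptions for illustration, and timestamps are interpreted as UTC.

```python
from datetime import datetime, timezone


def split_timestamp(ts):
    """Split a Unix timestamp (seconds since the epoch) into numeric parts.

    Hypothetical helper, not part of GraphLab Create. Interprets ts as UTC.
    """
    dt = datetime.fromtimestamp(ts, tz=timezone.utc)
    return {
        "year": dt.year,
        "month": dt.month,
        "day": dt.day,
        "hour": dt.hour,
        "minute": dt.minute,
        "second": dt.second,
    }


# Hypothetical rating rows: user, item, rating, and a raw Unix timestamp.
ratings = [
    {"user_id": "0", "item_id": "a", "rating": 1, "timestamp": 1420711837},
    {"user_id": "1", "item_id": "b", "rating": 4, "timestamp": 1420739664},
]

# Replace the raw timestamp with one numeric column per component.
rows = []
for r in ratings:
    row = {k: v for k, v in r.items() if k != "timestamp"}
    row.update(split_timestamp(r["timestamp"]))
    rows.append(row)
```

Each resulting row keeps user_id, item_id, and rating, and gains six numeric columns that could then be placed alongside them in the observation data.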


User 1025 | 1/8/2015, 5:54:06 PM

What's the "rating line"? Perhaps you could show this in the context of the Dato example:

import graphlab

# Observation data: one row per (user, item, rating).
sf = graphlab.SFrame({'user_id': ["0", "0", "0", "1", "1", "2", "2", "2"],
                      'item_id': ["a", "b", "c", "a", "b", "b", "c", "d"],
                      'rating': [1, 3, 2, 5, 4, 1, 4, 3]})

# User side data, keyed by user_id.
user_info = graphlab.SFrame({'user_id': ["0", "1", "2"],
                             'name': ["Alice", "Bob", "Charlie"],
                             'numeric_feature': [0.1, 12, 22]})

# Item side data, keyed by item_id.
item_info = graphlab.SFrame({'item_id': ["a", "b", "c", "d"],
                             'name': ["item1", "item2", "item3", "item4"],
                             'dict_feature': [{'a': 23}, {'a': 13},
                                              {'b': 1},
                                              {'a': 23, 'b': 32}]})

m2 = graphlab.recommender.ranking_factorization_recommender.create(
    sf, target='rating',
    user_data=user_info,
    item_data=item_info)

Where is the "rating line" for date entry?

Thank you.


User 1025 | 1/8/2015, 6:06:46 PM

I found it in the documentation:

<i>Additionally, observation-specific information, such as the time of day when the user rated the item, can also be included. Any column in the observation_data SFrame that is not the user id, item id, or target is treated as an <b class="Bold">observation</b> side feature. The same side feature columns must be present when calling predict.</i>

This clears up my confusion: I had assumed the extra columns were used as user or item side info. :smile:
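For anyone else reading, the rule in that quote can be sketched in plain Python. `observation_side_columns` and `missing_side_columns` are hypothetical helpers (not GraphLab Create functions), and the column names are assumed for illustration.

```python
def observation_side_columns(columns, user_id="user_id", item_id="item_id",
                             target="rating"):
    """Return the columns that are not the user id, item id, or target."""
    reserved = {user_id, item_id, target}
    return [c for c in columns if c not in reserved]


def missing_side_columns(train_columns, predict_columns, **names):
    """Return training side columns that are absent from the prediction data."""
    needed = observation_side_columns(train_columns, **names)
    return [c for c in needed if c not in predict_columns]


# Hypothetical observation_data columns used at training time.
train_columns = ["user_id", "item_id", "rating", "day_of_week", "hour"]
side = observation_side_columns(train_columns)  # ['day_of_week', 'hour']

# Hypothetical prediction data that forgot one of the side columns.
predict_columns = ["user_id", "item_id", "hour"]
missing = missing_side_columns(train_columns, predict_columns)  # ['day_of_week']
```

So in the Dato example, a day or time column would simply be added to the sf SFrame next to the user id, item id, and rating columns.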