Difference between new_observation_data and new_user_data.

User 4540 | 4/11/2016, 6:57:30 AM

Hello, I am finding it a bit difficult to decide which of these two parameters to use in recommend(). I want recommend() to provide recommendations for a new user ID, but the data I want to pass in is taken from an existing user in the trained model (item IDs and ratings), so that I can check whether the new user is similar to the existing one. Which of the two parameters should I use for this, and what output should I expect? Also, if this approach isn't the right way to find similarity between users, do you have any advice? My understanding is that recommend() for a new user should return the same items and ratings as the existing user whose data I pass as the argument. Any insight would be much appreciated.

The data has user IDs, item IDs and ratings for 100 users:

    m = graphlab.recommender.item_similarity_recommender.create(data, user_id='UserId', item_id='ItemId', target='Rating', similarity_type='cosine')
    m.recommend(users=[101], new_user_data/new_observation_data=newdata)

Comments

User 91 | 4/11/2016, 4:59:12 PM

new_observation_data refers to the interactions of users with items, e.g., user U rated item I with rating R. new_user_data refers only to user metadata, e.g., user U is male and 20 years old.
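To make the distinction concrete, here is a rough sketch of the shape of each table, written as plain column-oriented dicts rather than SFrames. The column names "Gender" and "Age" and the helper n_rows are made up for illustration; only "UserId", "ItemId" and "Rating" come from the question.

```python
# Observation data: one row per (user, item, rating) interaction.
new_observation_data = {
    "UserId": [101, 101],
    "ItemId": ["X", "Y"],
    "Rating": [7, 4],
}

# User side data: one row per user, describing the user, not any item.
# "Gender" and "Age" are hypothetical column names.
new_user_data = {
    "UserId": [101],
    "Gender": ["M"],
    "Age": [20],
}


def n_rows(table):
    """Return the row count of a column-oriented table (hypothetical helper)."""
    lengths = {len(column) for column in table.values()}
    if len(lengths) != 1:
        raise ValueError("all columns must have the same length")
    return lengths.pop()
```

The observation table grows with every rating a user gives, while the side-data table has a single row per user and never mentions items.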


User 4540 | 4/11/2016, 6:39:38 PM

Thank you for the quick reply. So in my scenario I should use new_observation_data, since I plan to pass the existing user's rating of a particular item to recommend():

    newdata = graphlab.SFrame({'UserId': [101], 'ItemId': ["X"], 'Rating': [7]})
    recommendation = result.recommend(users=[101], new_observation_data=newdata)

The output I am getting is hard to interpret: it excludes this particular item ID but does show other recommended items. Also, is this the correct way to provide recommendations to a new user based on a trained model? Since my objective is to find out whether a new user is similar to an existing one, is there another approach you would suggest?


User 91 | 4/11/2016, 6:59:43 PM

That is the correct way to get recommendations for a new user.
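As a rough illustration of why the item you passed in does not come back, here is a simplified, pure-Python sketch of item-similarity scoring for a new user. It is not GraphLab Create's actual implementation: the function names, the rating-weighted scoring rule and the toy data are all assumptions made for the example.

```python
import math
from collections import defaultdict


def item_cosine_similarities(observations):
    """Cosine similarity between item rating vectors.

    observations: iterable of (user, item, rating) tuples.
    Returns {(item_a, item_b): similarity} for pairs with a shared rater.
    """
    by_item = defaultdict(dict)
    for user, item, rating in observations:
        by_item[item][user] = rating
    norms = {
        item: math.sqrt(sum(r * r for r in ratings.values()))
        for item, ratings in by_item.items()
    }
    sims = {}
    for a in by_item:
        for b in by_item:
            if a == b:
                continue
            shared = set(by_item[a]) & set(by_item[b])
            dot = sum(by_item[a][u] * by_item[b][u] for u in shared)
            if dot > 0:
                sims[(a, b)] = dot / (norms[a] * norms[b])
    return sims


def recommend_for_new_user(sims, new_observations, k=10, exclude_known=True):
    """Score items by rating-weighted similarity to the items the user rated.

    new_observations: iterable of (item, rating) pairs for the new user.
    """
    known = dict(new_observations)
    scores = defaultdict(float)
    for (a, b), similarity in sims.items():
        if a in known:
            scores[b] += similarity * known[a]
    if exclude_known:
        # Items the user has already rated are dropped from the output.
        for item in known:
            scores.pop(item, None)
    ranked = sorted(scores.items(), key=lambda pair: (-pair[1], pair[0]))
    return ranked[:k]


# Toy training data (made up): users 1-3 rating items X, Y, Z.
train = [(1, "X", 5), (1, "Y", 4), (2, "X", 4), (2, "Y", 5), (2, "Z", 1), (3, "Z", 5)]
sims = item_cosine_similarities(train)
recs = recommend_for_new_user(sims, [("X", 7)])
```

In this sketch the new user's rating of X is used only to score other items, and X itself is filtered out as already known. GraphLab Create's recommend() has an exclude_known argument which, as far as I recall, defaults to True and would explain the same behaviour in your output.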

If you want to find similar users, you can use the get_similar_users function. Unfortunately, it doesn't have a new_user option for passing in data for a new user. We will definitely add that as a feature. For now, your workaround would be to use the nearest neighbours model to compute similar users, i.e., one model for recommendations and another model for similar users. Hope that helps!


User 4540 | 4/11/2016, 8:27:43 PM

Thanks again for clarifying this. Yes, this makes sense: I tried get_similar_users() and realized that it doesn't work for new users and only finds similarity among the existing users in the trained model. So I should use the nearest neighbors model, and in particular its query method, to find the points in the data that are closest to the new user's data?


User 4540 | 4/12/2016, 7:07:22 AM

The nearest neighbor classifier is a classification model, and I am not sure it would serve my objective of implementing user-based similarity here. I plan to implement a recommendation algorithm that not only provides recommendations to new users based on their item preferences/ratings, but also returns the similar users from the model who gave the same ratings to certain items. Thank you for the help though, I really appreciate it.


User 91 | 4/13/2016, 12:37:57 AM

The nearest neighbours model is not only for classification. You can use it for unsupervised learning as well, e.g., finding similar users.

Our user guide (https://dato.com/learn/userguide/nearestneighbors/nearestneighbors.html) can hopefully walk you through the process:

  1. Define a distance function that captures "similarity between users"
  2. Train a model using your data
  3. Make predictions for a new user based on similarity to the existing users in your distance metric.
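
The three steps above can be sketched in plain Python as follows. This is only an illustration under assumptions, not the GraphLab Create nearest neighbours API: cosine distance is one possible choice of metric, and the names cosine_distance, build_index and query, as well as the toy ratings, are made up for the example.

```python
import math


def cosine_distance(a, b):
    """Step 1: distance between two users' ratings, given as {item: rating}."""
    dot = sum(rating * b[item] for item, rating in a.items() if item in b)
    norm_a = math.sqrt(sum(r * r for r in a.values()))
    norm_b = math.sqrt(sum(r * r for r in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 1.0  # treat an empty or all-zero profile as maximally distant
    return 1.0 - dot / (norm_a * norm_b)


def build_index(observations):
    """Step 2: 'train' by grouping (user, item, rating) rows per user."""
    index = {}
    for user, item, rating in observations:
        index.setdefault(user, {})[item] = rating
    return index


def query(index, new_ratings, k=1, distance=cosine_distance):
    """Step 3: return the k existing users closest to the new user's ratings."""
    ranked = sorted((distance(new_ratings, ratings), user)
                    for user, ratings in index.items())
    return [(user, dist) for dist, user in ranked[:k]]


# Toy data (made up): three existing users.
index = build_index([(1, "X", 7), (1, "Y", 3), (2, "X", 1), (2, "Y", 9), (3, "Z", 5)])

# A new user whose ratings copy user 1 should come back closest to user 1.
neighbours = query(index, {"X": 7, "Y": 3}, k=2)
```

Passing an existing user's ratings as the new user's data gives a distance of roughly zero to that user, which is a quick sanity check that the metric captures the kind of similarity you are after.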