How exactly is score(i,j) computed in a recommender?

User 5170 | 5/3/2016, 3:43:59 PM

I'm trying to understand how the score of a predicted item is computed. I looked at https://dato.com/products/create/docs/generated/graphlab.recommender.factorizationrecommender.FactorizationRecommender.html#ga=1.231230623.1489653702.1460563617 (score(i,j) = μ + w_i + w_j + a^T x_i + b^T y_j + u_i^T v_j) and extracted the coefficients of my model. For a given item j I have the linear coefficient and the latent factors, which I assume are v_j (right?). For a given user i I have the linear coefficient and the latent factors, which I assume are u_i. For a given feature (time) I also have the linear and latent factors. Is that latent factor {a or b} or {x_i or y_j}?

So how exactly should I combine them to compute score(i,j)?

Comments

User 1207 | 5/4/2016, 1:47:31 AM

Hello @laraspin,

The equation you cited is for the case where the side information has only linear terms, not latent factors. In that case, a and b are the weight vectors of the linear model over the side features x_i and y_j, i.e. the information other than the user and the item themselves (the user and the item each have both a linear weight and a latent factor). If your model has latent factors for everything (i.e. it was trained with side_data_factorization = True), then the score is calculated using the equation in the cited paper: Steffen Rendle, "Factorization Machines," in Proceedings of the 10th IEEE International Conference on Data Mining (ICDM), 2010.
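To make the two cases concrete, here is a minimal sketch in plain Python. It is not GraphLab's internal code: the function names and the `(value, linear_weight, latent_vector)` layout are hypothetical, and it assumes the user and the item each enter the factorization machine as an indicator feature with value 1.0, as in Rendle's formulation. Under that assumption, the latent factor extracted for a side feature such as time plays the same role as u_i or v_j and interacts pairwise with every other active feature.

```python
def dot(p, q):
    """Inner product of two equal-length sequences."""
    return sum(a * b for a, b in zip(p, q))


def score_linear_side(mu, w_user, w_item, a, x_user, b, y_item, u_user, v_item):
    """Side data enters linearly only:
    mu + w_i + w_j + a^T x_i + b^T y_j + u_i^T v_j
    """
    return (mu + w_user + w_item
            + dot(a, x_user) + dot(b, y_item)
            + dot(u_user, v_item))


def score_factorization_machine(mu, active):
    """Second-order factorization machine (Rendle, 2010).

    `active` is a hypothetical layout: one (value, linear_weight,
    latent_vector) tuple per nonzero feature in the row, including the
    user and item indicators (value 1.0).
    """
    total = mu + sum(x * w for x, w, _ in active)
    # Every pair of active features interacts through its latent vectors.
    for p in range(len(active)):
        for q in range(p + 1, len(active)):
            x_p, _, v_p = active[p]
            x_q, _, v_q = active[q]
            total += x_p * x_q * dot(v_p, v_q)
    return total


# Made-up coefficients, for illustration only.
user = (1.0, 0.5, [1.0, 2.0])     # indicator, w_i, u_i
item = (1.0, -0.25, [0.5, 0.25])  # indicator, w_j, v_j
time = (1.0, 0.1, [1.0, 0.0])     # one side feature with its own latent factor

no_side = score_factorization_machine(3.0, [user, item])
with_time = score_factorization_machine(3.0, [user, item, time])
```

With only the user and item active, the factorization machine reduces to μ + w_i + w_j + u_i^T v_j, which matches `score_linear_side` when there are no side features.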

In addition, if the side features are categorical (e.g. strings, lists of strings, dicts), then each unique key/index is assigned a dimension, and the feature is treated as a sparse vector. If it's a real value (or a vector of real values), then it is also normalized across examples in the training data (i.e. scaled by the standard deviation) for numerical stability.
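As a rough illustration of that preprocessing, the sketch below maps categorical values to sparse dimensions and scales a numeric column by its standard deviation. The helper names are hypothetical, and using the population standard deviation without mean-centering is an assumption; the exact normalization GraphLab applies may differ.

```python
import statistics


def build_categorical_index(values):
    """Assign one dimension to each unique category, in order of first appearance."""
    index = {}
    for value in values:
        if value not in index:
            index[value] = len(index)
    return index


def encode_categorical(value, index):
    """Sparse one-hot encoding as {dimension: 1.0}; unseen categories map to {}."""
    return {index[value]: 1.0} if value in index else {}


def scale_by_std(values):
    """Divide a numeric column by its standard deviation.

    Assumption: population std, no mean-centering.
    """
    std = statistics.pstdev(values)
    if std == 0:
        return list(values)
    return [v / std for v in values]


day_index = build_categorical_index(["mon", "tue", "mon"])
encoded = encode_categorical("tue", day_index)
scaled = scale_by_std([2.0, 4.0, 6.0])
```

When reproducing a score by hand, the same scaling would have to be applied to the raw feature values before they are multiplied by the extracted coefficients.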

Does that help clear up the issue?

Thanks! -- Hoyt