precision/recall metrics in GraphChi

User 227 | 4/22/2014, 5:24:18 PM

I'd like to re-raise Denis Parra's issue about recommender system evaluation metrics for GraphChi. To my knowledge, these metrics are still not available.

In GraphChi, I am able to run

toedtli@phoenix:~/pre/graphchi$ ./toolkits/collaborative_filtering/svd --training=smallnetflix_mm --test=smallnetflix_mm --nsv=20 --nv=20 --max_iter=4 --quiet=1 --save_vector=1

WARNING: common.hpp(print_copyright:204): GraphChi Collaborative filtering library is written by Danny Bickson (c). Send any comments or bug reports to danny.bickson@gmail.com
[training] => [smallnetflix_mm]
[test] => [smallnetflix_mm]
[nsv] => [20]
[nv] => [20]
[max_iter] => [4]
[quiet] => [1]
[save_vector] => [1]
Load matrix smallnetflix_mm
Starting iteration: 1 at time: 0.322977
Starting step: 1 at time: 0.599368
Starting step: 2 at time: 1.02481
...
Starting step: 18 at time: 32.7282
Starting step: 19 at time: 33.5347
Number of computed signular values 6
Singular value 0 3276.69 Error estimate: 0.000453408
Singular value 1 1064.07 Error estimate: 5.52205e-15
Singular value 2 956.598 Error estimate: 2.5496e-13
Singular value 3 891.019 Error estimate: 2.28888e-11
Singular value 4 742.118 Error estimate: 1.11722e-09
Singular value 5 695.528 Error estimate: 3.71108
Going to save output vectors U and V
Lanczos finished 38.2352
Finished writing 3298163 predictions to file: smallnetflix_mm.predict

to get the predicted ratings:

toedtli@phoenix:~/pre/graphchi$ head smallnetflix_mm.predict
%%MatrixMarket matrix coordinate real general
%This file contains predictions of user/item pair, one prediction in each line. The first column is user id. The second column is the item id. The third column is the computed prediction.
95526 3561 3298163
13 1 0.066013243
83 1 0.079596292
127 1 0.10323781

So I assume I could write a script that collects the top-n predicted items for each test user and computes the true-positive, false-positive, etc. rates myself. But since this step does not depend on the actual recommender algorithm, shouldn't it be done inside GraphLab/GraphChi directly? This would certainly be a great feature. I'll gladly volunteer as a beta tester.
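
For concreteness, here is a minimal Python sketch of the kind of script I have in mind. It is not part of GraphChi. The function names (read_matrix_market, relevant_items, precision_recall_at_k) are hypothetical, and it rests on two assumptions: a test rating at or above some threshold counts as "relevant", and the ranking for each user is taken only over the user/item pairs present in the .predict file.

```python
from collections import defaultdict


def read_matrix_market(lines):
    """Parse MatrixMarket coordinate lines into {user: {item: value}}."""
    data = defaultdict(dict)
    size_seen = False
    for line in lines:
        line = line.strip()
        if not line or line.startswith("%"):
            continue  # skip blank lines and header/comment lines
        if not size_seen:
            size_seen = True  # first data line is "rows cols nnz"
            continue
        user, item, value = line.split()[:3]
        data[int(user)][int(item)] = float(value)
    return data


def relevant_items(test, threshold):
    """Assumption: an item is relevant if its test rating >= threshold."""
    return {
        user: {item for item, rating in items.items() if rating >= threshold}
        for user, items in test.items()
    }


def precision_recall_at_k(predicted, relevant, k):
    """Mean precision@k and recall@k over users with >= 1 relevant item."""
    precisions, recalls = [], []
    for user, rel in relevant.items():
        if not rel:
            continue  # recall is undefined without relevant items
        scores = predicted.get(user, {})
        # Rank this user's predicted items by score, highest first.
        top_k = sorted(scores, key=scores.get, reverse=True)[:k]
        hits = len(set(top_k) & rel)
        precisions.append(hits / k)
        recalls.append(hits / len(rel))
    n = len(precisions)
    if n == 0:
        return 0.0, 0.0
    return sum(precisions) / n, sum(recalls) / n
```

To use it on the files above, one would pass the open file objects for smallnetflix_mm.predict and the test matrix to read_matrix_market, build the relevant sets with a threshold such as 4 (my choice, not something GraphChi defines), and call precision_recall_at_k with the desired k.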

Kind regards, Beat

Beat Tödtli, Dr.rer.nat. Researcher Informatics Department University of Applied Sciences Switzerland (FFHS) Althardstrasse 60 | CH-8105 Regensdorf Phone: +41 (0) 44 842 1568 Web: http://lws.ffhs.ch
