User 3218 | 6/29/2016, 8:05:21 PM
I am working with a data set where users rate other users. This data is used to recommend users to each other. There are around 10M users and a billion ratings among the users (yes, some of our users are highly engaged!).
Right now, creating a factorization recommender to produce just 20 recommendations per user takes 18+ hours on a C3.xlarge instance. We can move to bigger C3 instances with more cores, but the time taken stays within pretty much the same order of magnitude. The code is textbook:
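import graphlab as gl
# No header row, so SFrame names the columns X1, X2, X3
data = gl.SFrame.read_csv(local_ratings_file, header=False, column_type_hints=int, verbose=False)
model = gl.recommender.create(data, user_id='X1', item_id='X2', target='X3', verbose=True)
# users=None generates recommendations for every user in the training data
results = model.recommend(users=None, k=total_recs)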
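For a sense of scale, here is a rough back-of-envelope sketch in plain Python using the approximate figures above. The function name and the brute-force assumption (every user scored against every other user) are mine, purely for illustration; I don't know whether the library actually scores every pair.

```python
# Back-of-envelope scale estimate; the figures are the approximate ones quoted above.
NUM_USERS = 10 ** 7      # ~10M users (also the items, since users rate users)
NUM_RATINGS = 10 ** 9    # ~1B ratings
TOTAL_RECS = 20          # recommendations requested per user


def scale_estimate(num_users, num_ratings, k):
    """Return rough sizes for the training data and the recommend() output."""
    return {
        "avg_ratings_per_user": num_ratings / float(num_users),
        "output_rows": num_users * k,
        # Assumption: a brute-force pass scores each user against every other user.
        "candidate_pairs": num_users * (num_users - 1),
    }


estimate = scale_estimate(NUM_USERS, NUM_RATINGS, TOTAL_RECS)
for name, value in sorted(estimate.items()):
    print("{}: {:,.0f}".format(name, value))
```

So the result alone is about 200 million rows, and a naive all-pairs scoring pass would touch on the order of 10^14 candidate pairs.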
Is there a way to speed things up?