User 2568 | 5/3/2016, 3:30:00 AM
I'm competing in the [Kaggle Expedia Hotel Recommendations](https://www.kaggle.com/c/expedia-hotel-recommendations/data) competition, and I have an SFrame of hotel search log data with 35M rows. I want to group by search id and return the top five hotels by relevance for each id. In effect, I want `aggregate.ARGMAX('hotel', 'relevance')` to return a list of the top 5 items, not just one.
This is detailed in this notebook
I have three questions:

1. In the notebook, is there a better way to write the code? It seems fine and the performance is OK, but I'm interested in learning.
2. I'd like to request that ARGMAX be extended to take an optional third parameter that determines how many items to return. This would not be unusual for recommendation-type problems.
3. There is an odd behaviour noted at the end of the notebook, where an int() is converted to a float(). Is this a bug or my misunderstanding?
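
To make the requested behaviour concrete, here is a minimal sketch in plain Python rather than SFrame. It is not the notebook's code and not an existing SFrame API: the function name `top_k_per_group`, the `(search_id, hotel, relevance)` row layout, and the sample data are all assumptions made for illustration only.

```python
import heapq
from collections import defaultdict


def top_k_per_group(rows, k=5):
    """Return {search_id: [hotels]} with up to k hotels per id, highest relevance first.

    `rows` is assumed to be an iterable of (search_id, hotel, relevance) tuples.
    """
    grouped = defaultdict(list)
    for search_id, hotel, relevance in rows:
        grouped[search_id].append((relevance, hotel))

    # heapq.nlargest keeps only the k best pairs per group, ordered by relevance.
    return {
        search_id: [hotel for _, hotel in heapq.nlargest(k, pairs, key=lambda p: p[0])]
        for search_id, pairs in grouped.items()
    }


# Hypothetical sample data, not taken from the competition files.
sample_rows = [
    (1, "a", 0.9),
    (1, "b", 0.5),
    (1, "c", 0.7),
    (2, "d", 0.1),
]
result = top_k_per_group(sample_rows, k=2)
print(result)  # {1: ['a', 'c'], 2: ['d']}
```

An ARGMAX with an optional third parameter would, in effect, return the same per-id lists as this sketch, with the parameter playing the role of `k`.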