Market Basket Format

User 230 | 5/8/2014, 5:52:20 PM

Hi there,

Still in the process of learning the ins and outs of GraphChi. One thing I don't understand is the expected Matrix Market format for the scoring set (I have it working for validation and training). Specifically, I followed this to create mine: http://bickson.blogspot.co.uk/2012/02/matrix-market-format.html

The header file looks like this:

    head 20140506recotoscoreGL_full.score:info
    %%MatrixMarket matrix coordinate real general
    311671 103773 311541

The actual file looks like this:

    head 20140506recotoscoreGL_full.score
    49745 40996
    49749 619
    136201 6195
    136205 619

Any suggestions, gents?

Thanks!

Comments

User 6 | 5/8/2014, 6:04:31 PM

Matrix Market format requires a value for each non-zero matrix entry. Since you are talking about the test data, this value is not known, so you can put a placeholder such as "1". The actual file would then look like this:

    head 20140506recotoscoreGL_full.score
    49745 40996 1
    49749 619 1
    136201 6195 1
    136205 619 1
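For readers who want to script this step, here is a minimal Python sketch of adding the placeholder column. The function name, the tab delimiter, and the choice of "1" as the default are assumptions for illustration, not part of GraphChi.

```python
def add_placeholder_values(src_path, dst_path, placeholder="1"):
    """Append a placeholder value to every two-column (row, col) line.

    Hypothetical helper: lines that already have three or more fields
    are copied through unchanged, and blank lines are dropped.
    Returns the number of data lines written.
    """
    written = 0
    with open(src_path) as src, open(dst_path, "w") as dst:
        for line in src:
            fields = line.split()
            if not fields:
                continue  # skip blank lines
            if len(fields) == 2:
                fields.append(placeholder)  # the unknown test value
            dst.write("\t".join(fields) + "\n")
            written += 1
    return written
```

If the row count changes (for example because blank lines were dropped), the non-zero count on the size line of the header should be checked against the returned value.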

By the way which algorithm are you using?


User 230 | 5/8/2014, 6:26:15 PM

Got it ... I also tried with 0 which was not a good idea :)

I'm using ALS. I added the 1s at the end but it didn't help:

    head 20140506recotoscoreGL_full.score
    49745 40996 1
    49749 619 1
    136201 6195 1
    136205 619 1
    220129 338 1
    185637 619 1

(tab-delimited)

    head 20140506recotoscoreGL_full.score\:info
    %%MatrixMarket matrix coordinate real general
    %Generated 05-01-2014
    311671 103773 311541

Error returned:

    DEBUG: rmseengine.hpp(resetrmse:148): Detected number of threads: 8
    7.2135) Iteration: 1 Training RMSE: 0.680708
    INFO: graphchiengine.hpp(run:906): Finished updates
    INFO: als.cpp(outputalsresult:176): ALS output files (in matrix market format): 20140506recoinputGLfull.trainU.mm, 20140506recoinputGLfull.train_V.mm
    Error: Failed to read matrix market header: Success


User 6 | 5/8/2014, 6:34:59 PM

Please send us the full command line you used. This error means that the Matrix Market header could not be read.
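To check a header by hand before running the toolkit, a minimal Python sketch along these lines may help. It assumes the usual Matrix Market layout (a five-token `%%MatrixMarket` banner, optional `%` comment lines, then a `rows cols nnz` size line); the function name is hypothetical and this is not GraphChi's own parser.

```python
def read_mm_header(lines):
    """Return (rows, cols, nnz) from an iterable of header lines.

    Raises ValueError if the banner or the size line is missing or
    malformed. Illustrative only; real parsers may be stricter.
    """
    it = iter(lines)
    banner = next(it, "").split()
    if len(banner) != 5 or banner[0] != "%%MatrixMarket":
        raise ValueError("missing or malformed %%MatrixMarket banner")
    for line in it:
        stripped = line.strip()
        if not stripped or stripped.startswith("%"):
            continue  # comment or blank line before the size line
        parts = stripped.split()
        if len(parts) != 3:
            raise ValueError("size line must be: rows cols nnz")
        rows, cols, nnz = (int(p) for p in parts)
        return rows, cols, nnz
    raise ValueError("no size line found")
```

Passing it the lines of a data file instead of a header file raises the banner error, because a data line such as `49745 40996 1` has three tokens rather than five.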


User 230 | 5/8/2014, 6:41:56 PM

Sure. Here you go


User 6 | 5/8/2014, 6:53:19 PM

Hi Guy, please get the latest from GitHub (using git pull; make clean; make cf). I have added some more traces to debug the problem. Something is invalid in the banner's content or location.


User 230 | 5/8/2014, 10:43:13 PM

I got the following message while reading the "test" (i.e. scoring) set:

    ERROR: mmio.c(mmreadbanner:119): scanf returned 3
    FATAL: io.hpp(readmatrixmarketbannerandsize:56): Could not process Matrix Market banner. File: folder1/20140506recotoscoreGLfull.score.predict

I wonder if I'm supposed to cat the :info header and the data file together into a single file. When I do that, it works fine.
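The concatenation workaround can be sketched in Python roughly as follows. The function name and the three path arguments are hypothetical; the only behaviour assumed is that the header lines must come first, each on its own line, followed by the data lines.

```python
def inline_header(info_path, data_path, out_path):
    """Write the header file followed by the data file into out_path.

    Equivalent in spirit to `cat header data > out`, except that a
    newline is added if the header file does not end with one, so the
    size line and the first data line cannot run together.
    """
    with open(info_path) as info:
        header = info.read()
    if header and not header.endswith("\n"):
        header += "\n"
    with open(data_path) as data, open(out_path, "w") as out:
        out.write(header)
        for line in data:
            out.write(line)
```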

Hope this helps.


User 6 | 5/9/2014, 4:06:07 AM

Thanks, Guy, for the extended testing. The error messages were previously unclear; they are clearer now. What happens is that the file with the Matrix Market header (folder1/20140506recotoscoreGL_full.score.predict:info) is not found, so the program falls back to reading the banner information from the input file itself, which does not have a banner in it. I guess there could be a typo in the file name, or maybe access permissions prevent the info file from being opened.
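Both suspected causes (a file name typo or a permissions problem) can be checked before launching a run. Below is a minimal Python sketch; the helper name is hypothetical, and the only convention taken from this thread is that the header lives next to the data file under the same name with ":info" appended.

```python
import os

def check_info_file(data_path):
    """Return a short diagnosis for the '<data_path>:info' header file.

    Hypothetical helper: it only reports whether the header file
    exists and is readable by the current user, not whether its
    contents are valid.
    """
    info_path = data_path + ":info"
    if not os.path.exists(info_path):
        return "missing: " + info_path
    if not os.access(info_path, os.R_OK):
        return "not readable: " + info_path
    return "ok: " + info_path
```

Note that the path to pass in is the exact name printed in the error message, which may differ from the name of the file that was originally prepared.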


User 230 | 5/9/2014, 5:20:19 PM

Thanks Danny. Got it now. Your help is much appreciated.