User 231 | 5/9/2014, 5:56:34 AM
I am working with a large graph and testing the simple Single Source Shortest Path (SSSP) task from the toolkit package. The problem is that GraphLab spends far too much time loading (and, I guess, partitioning) the graph. My graph has 172,655,479 vertices and 1,544,271,504 edges, so it is fairly large. I run the SSSP code on a 32-node cluster (each node has 4 cores and 8 GB of memory). The whole job takes about 4000 seconds to finish, but the logs show that the actual engine time is just 19 seconds! It seems that the rest, more than an hour, is spent loading and partitioning the graph data.
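To put those numbers in perspective, here is a rough back-of-the-envelope calculation in Python. It is only a sketch: it assumes everything outside the reported engine time is loading and partitioning (I have not timed those phases separately), and the variable names are just for illustration.

```python
# Figures taken from the run described above.
NUM_VERTICES = 172_655_479
NUM_EDGES = 1_544_271_504
NUM_MACHINES = 32
TOTAL_SECS = 4000      # wall-clock time for the whole job
ENGINE_SECS = 19       # engine time reported in the logs

# Assumption: all non-engine time is graph loading/partitioning.
non_engine_secs = TOTAL_SECS - ENGINE_SECS
load_fraction = non_engine_secs / TOTAL_SECS

# Average share of the graph per machine, assuming an even split.
edges_per_machine = NUM_EDGES / NUM_MACHINES
vertices_per_machine = NUM_VERTICES / NUM_MACHINES

# Cluster-wide edge ingest rate implied by the assumption above.
edges_per_sec = NUM_EDGES / non_engine_secs

print(f"non-engine time:      {non_engine_secs} s ({load_fraction:.1%} of total)")
print(f"edges per machine:    {edges_per_machine:,.0f}")
print(f"vertices per machine: {vertices_per_machine:,.0f}")
print(f"implied ingest rate:  {edges_per_sec:,.0f} edges/s across the cluster")
```

So under that assumption, nearly all of the wall-clock time goes to getting the graph into memory rather than to the computation itself.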
Is there any way to speed up the loading process? Or is there some configuration option that I missed? Thanks for your answers!