Is there a way to convert an edge list to the binary format supported by PowerGraph?

User 158 | 3/18/2014, 7:39:15 PM

Hello,

Is there any tool in the PowerGraph distribution that converts an edge list to the binary format supported by PowerGraph? It takes too long for PowerGraph to read a large graph in edge list format.

Thanks, Da

Comments

User 33 | 3/19/2014, 2:01:30 AM

Maybe you can split your single input file into several smaller files, so that PowerGraph can load them in parallel.


User 158 | 3/19/2014, 4:09:50 PM

Is that how people normally load a graph into memory? It seems to me that constructing the graph in memory is also slow. I see PowerGraph also supports a binary format. How do I get a graph into that format?


User 14 | 3/19/2014, 4:38:38 PM

Hi Zhengda,

The binary format is native to GraphLab and can only be obtained by first loading the graph from a text file and then saving it as binary. In other words, to get a graph in binary format, you need to write a small program that loads the graph from the edge list using load_format() and then saves it to binary using save_binary(). After that you can use load_binary() to load your graph.

As rongcheng mentioned, PowerGraph loads files in parallel. If you have one huge file containing all the edges in your graph, it is recommended to split it into multiple smaller files. Gzipping the files will also increase your throughput.

Best, -jay
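
For anyone who wants to try the split-and-gzip advice above, here is a minimal sketch in Python. It is not part of the PowerGraph distribution. The function name split_edge_list, the output prefix "edges", and the numbered ".gz" file names are hypothetical choices for illustration. It assumes a text edge list with one edge per line, and that the loader is pointed at a path prefix and reads every gzipped file that matches it.

```python
import gzip
import itertools
from pathlib import Path


def split_edge_list(src, out_dir, num_parts=4, prefix="edges"):
    """Split a text edge list into num_parts gzipped files.

    Lines are dealt out round-robin, so the parts end up roughly equal
    in size. Blank lines are dropped. Returns the output paths.
    """
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)

    # Hypothetical naming scheme: edges.0000.gz, edges.0001.gz, ...
    paths = [out_dir / f"{prefix}.{i:04d}.gz" for i in range(num_parts)]
    outs = [gzip.open(p, "wt") for p in paths]
    try:
        with open(src, "r") as f:
            edges = (line.strip() for line in f)
            edges = (edge for edge in edges if edge)
            # Pair each edge with the next output file in the cycle.
            for edge, out in zip(edges, itertools.cycle(outs)):
                out.write(edge + "\n")
    finally:
        for out in outs:
            out.close()
    return paths
```

A call such as split_edge_list("graph.txt", "parts", num_parts=16) would write 16 gzipped files under parts/. A reasonable starting point for num_parts is the number of machines or loader threads, though the best value depends on the cluster.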