Should I save an SGraph G with G.save(), or save its edges and vertices as SFrames?

User 1286 | 2/15/2015, 8:39:27 AM

I am working with a large graph, which I represent as an SGraph. I tried to save it with G.save(), but saving takes a very long time, even when the graph is not that big. Does anyone have a suggestion for how to fix this?

Thank you so much,

Comments

User 1190 | 2/17/2015, 8:58:08 AM

Hi,

The SGraph is probably not materialized, so the pending computation on the graph is being fully evaluated during save(). To correctly measure the save time for the SGraph, try calling g.materialize() before g.save().
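To illustrate why a lazily evaluated graph can make save() look slow, here is a minimal sketch that times the two phases separately. `LazyGraph` is a hypothetical stand-in written only for this example (it is not the GraphLab API), and the `time.sleep` call simulates whatever computation is still pending; `timed` and `measure` are likewise illustrative helper names.

```python
import time


class LazyGraph:
    """Hypothetical stand-in for a lazily evaluated graph (not the GraphLab API)."""

    def __init__(self, build_seconds):
        self._build_seconds = build_seconds
        self._materialized = False

    def materialize(self):
        # Run the deferred computation once; later calls are no-ops.
        if not self._materialized:
            time.sleep(self._build_seconds)  # simulated pending work
            self._materialized = True

    def save(self, path):
        # Saving forces evaluation if it has not happened yet.
        self.materialize()
        # Writing to `path` is omitted in this sketch.


def timed(fn, *args):
    """Return the wall-clock seconds taken by fn(*args)."""
    start = time.perf_counter()
    fn(*args)
    return time.perf_counter() - start


def measure(graph, path):
    """Time materialization and save as two separate phases."""
    return {
        "materialize": timed(graph.materialize),
        "save": timed(graph.save, path),
    }
```

With a real graph the same pattern would apply: time the materialize call first, then time save() on its own, so the second number reflects only the cost of writing the graph out.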

If save() still takes very long, please let me know:

1. The size of the graph: number of edges, number of vertices, number of fields, and the field types.
2. The spec of your machine: number of cores and amount of memory.
3. The actual time it takes to save the materialized graph.

Thanks, Jay


User 1190 | 2/17/2015, 10:14:20 PM

Here's a simple benchmark on my dev box (magnetic disk, 8 GB RAM, 8 cores, default GraphLab configuration):

<pre><code>In [2]: g = gl.load_graph('/data/webgraphs/sgraph/twiiterrv/')
[INFO] Start server at: ipc:///tmp/graphlabserver-29849
- Server binary: /home/haijieg/glenv/lib/python2.7/site-packages/graphlab/unityserver
- Server log: /tmp/graphlabserver1424210622.log
[INFO] GraphLab Server Version: 1.3.0

In [3]: %time g.save('/tmp/twitter')
CPU times: user 40 ms, sys: 28 ms, total: 68 ms
Wall time: 2min 22s

In [4]: g
Out[4]:
SGraph({'num_edges': 1468365182, 'num_vertices': 41652230})
Vertex Fields:['__id']
Edge Fields:['__src_id', '__dst_id']
</code></pre>
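For a rough point of comparison, the benchmark above can be turned into a save throughput figure. This is only a back-of-the-envelope sketch: the function name `save_throughput` is made up for this example, and the result depends heavily on disk, memory, and the number of fields, so treat it as a ballpark rather than an expected rate.

```python
def save_throughput(num_edges, wall_seconds):
    """Return edges written per second of wall-clock time."""
    return num_edges / wall_seconds


# Figures copied from the benchmark above.
num_edges = 1_468_365_182
wall_seconds = 2 * 60 + 22  # "Wall time: 2min 22s"

rate = save_throughput(num_edges, wall_seconds)
print(f"{rate / 1e6:.1f} million edges/s")  # prints "10.3 million edges/s"
```

If a materialized graph with similar fields saves at a rate far below this on comparable hardware, that is a sign something other than plain disk I/O is slowing it down.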


User 1286 | 2/24/2015, 9:54:19 AM

Thank you