About PowerGraph with HDFS data

User 872 | 11/20/2014, 10:52:25 AM

Hi,

I tested the computing speed with the data stored in HDFS and in NFS, and found that it is lower when using HDFS. I would like to know whether you plan to improve the computing speed with HDFS in the next version. Thank you!

Best, Lyuwei

Comments

User 6 | 11/20/2014, 1:13:03 PM

Hi, the PowerGraph open source code will soon be deprecated, as we are switching to GraphLab Create, which is our newer system.

HDFS will most likely be slower than NFS. This is not an issue with GraphLab but with your storage layer. Numerous parameters may affect HDFS speed, including replication factor, compression, and block size.
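
To see how much of the gap comes from the storage layer alone, one option is to time a plain sequential read of the same input file from each location, with no graph computation involved. The following is a minimal Python sketch and not part of GraphLab. It assumes both copies are reachable as ordinary file paths (for HDFS, for example, through a FUSE or NFS gateway mount); the function name `read_throughput_mb_s` and the chunk size are illustrative choices.

```python
import sys
import time


def read_throughput_mb_s(path, chunk_size=8 * 1024 * 1024):
    """Read the file at `path` sequentially and return throughput in MB/s."""
    total_bytes = 0
    start = time.perf_counter()
    with open(path, "rb") as f:
        while True:
            chunk = f.read(chunk_size)
            if not chunk:  # empty read means end of file
                break
            total_bytes += len(chunk)
    elapsed = time.perf_counter() - start
    # Guard against a zero elapsed time on very small files.
    return total_bytes / (1024 * 1024) / max(elapsed, 1e-9)


if __name__ == "__main__":
    # Pass one path per storage location on the command line.
    for path in sys.argv[1:]:
        print(f"{path}: {read_throughput_mb_s(path):.1f} MB/s")
```

Run it once per storage location on the same file and compare the numbers. Repeated reads may be served from the operating system's page cache, so the first run on each location is the most representative.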

GraphLab Create can scale on a single computing node to input data files that are terabytes in size. Working with a local disk will give you the best performance in this scenario; however, we do support HDFS in GraphLab Create. You are welcome to try it out.


User 872 | 11/22/2014, 7:38:31 AM

Thank you very much