Does GraphLab Create support its own cluster, or does it depend on a Hadoop/Spark cluster?

User 1197 | 7/31/2015, 9:14:30 PM

Hi, does GraphLab Create support its own cluster, or does it depend on a Hadoop/Spark cluster? If it supports its own cluster, can someone point me to a guide on how to set one up? Thanks!

Cheers, Dan

Comments

User 19 | 7/31/2015, 9:59:31 PM

Hi Dan,

GraphLab Create clusters can currently be created either in EC2 or on Hadoop YARN. See here for more details: https://dato.com/learn/userguide/deployment/pipeline-ec2-hadoop.html

Cheers, Chris


User 1197 | 7/31/2015, 10:32:45 PM

Hi Chris, so GraphLab Create does not support its own cluster? I remember that in the early days, people could set up a standalone GraphLab cluster and run graph analytics jobs such as PageRank on it, using mpiexec .... What is the relationship between the original GraphLab (in 2013), which was written in C++, and the current GraphLab Create? Thanks!

Cheers, Dan


User 19 | 7/31/2015, 11:14:49 PM

Hi Dan,

GraphLab Create can start a cluster on EC2, which can be used for running jobs in parallel and for running distributed machine learning algorithms. (The latter is in beta for interested customers.)

The original GraphLab C++ project provided a lot of inspiration for our graph toolkits and our distributed machine learning architecture. However, GraphLab Create leverages new data structures (the SFrame and the SGraph, also written in C++) that can handle large data sets on a single machine, rather than requiring an entire cluster as before.

Chris


User 1197 | 7/31/2015, 11:50:22 PM

OK, I see. Thanks, Chris!

Cheers, Dan