Compute resources required for GraphLab

User 284 | 5/27/2014, 6:23:13 PM

I have code that runs successfully on a single node, and I now need to run it on a cluster. What specifications are required to run GraphLab code efficiently on a cluster? For example: 1. How many machines are needed? 2. How many CPUs per machine? 3. How much RAM per machine? 4. How much disk space per machine?

Comments

User 6 | 5/28/2014, 5:25:07 PM

This is a hard question, since the answer depends on your dataset and algorithm. On the one hand, you would like to use the smallest number of machines whose combined memory can hold the problem. On the other hand, adding more machines should (in theory) make the algorithm run faster, so it is sometimes useful to add more.

Typically users approach this the other way around: they already have a set of X machines with configuration Y and want to run GraphLab on them.

To answer your question: machines with more cores are preferred. For compiling GraphLab we require at least 2 cores and 4 GB of RAM, but it is better to have 8-24 cores and 32-256 GB. We also require a few GB of disk space for compiling GraphLab. The rest depends on your problem size and algorithm.
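As a rough illustration of the "smallest number of machines that fits the problem into memory" idea, here is a back-of-the-envelope sketch in Python. It is not part of GraphLab. The function names, the per-vertex and per-edge byte sizes, the overhead factor, and the usable-RAM fraction are all assumptions made up for this example, so substitute numbers measured from your own single-node run.

```python
import math

GB = 1024 ** 3


def estimate_graph_bytes(num_vertices, num_edges,
                         bytes_per_vertex, bytes_per_edge,
                         overhead_factor=2.0):
    """Rough in-memory footprint of a graph.

    overhead_factor is a hypothetical multiplier covering replication,
    indexes, and message buffers; measure it rather than trusting 2.0.
    """
    raw = num_vertices * bytes_per_vertex + num_edges * bytes_per_edge
    return raw * overhead_factor


def min_machines(total_bytes, ram_per_machine_bytes, usable_fraction=0.8):
    """Smallest machine count whose combined usable RAM holds total_bytes.

    usable_fraction is an assumed share of RAM left after the OS and
    other processes take theirs.
    """
    usable = ram_per_machine_bytes * usable_fraction
    return max(1, math.ceil(total_bytes / usable))


if __name__ == "__main__":
    # Hypothetical workload: 1 billion vertices, 10 billion edges,
    # 16 bytes of data per vertex and 8 bytes per edge.
    total = estimate_graph_bytes(1_000_000_000, 10_000_000_000, 16, 8)
    for ram_gb in (32, 256):
        n = min_machines(total, ram_gb * GB)
        print(f"{ram_gb} GB per machine: at least {n} machine(s)")
```

This only gives a lower bound for fitting in memory. Whether adding machines beyond that actually speeds things up depends on the algorithm and has to be measured.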