[PowerGraph] SSSP utilizes only one machine in the cluster

I am running an SSSP algorithm on 128 machines using the command below. The graph is a large road network, not a power-law graph. Since the diameter of the graph is large, it takes many iterations to finish. However, only one machine shows about 10% CPU utilization, while the others stay at around 1 or 2%.

mpiexec -f ~/slaves -n 128 \
    "$GRAPHLABDIR"/release/toolkits/graphanalytics/sssp \
    --source ${src} \
    --directed 1 \
    --engine sync \
    --graph_opts ingress=auto \
    --graph inputgraph \
    --saveprefix outputdir
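To illustrate why a large diameter means many iterations with little work in each, here is a minimal toy sketch of a synchronous SSSP in Python. It is not PowerGraph's engine or its sssp toolkit: the function name sync_sssp and the chain-shaped test graph are made up for illustration, and the model only assumes that a vertex does work in a superstep when its distance changed in the previous one.

```python
from collections import defaultdict


def sync_sssp(edges, source):
    """Toy bulk-synchronous SSSP (illustrative only, not PowerGraph code).

    A vertex is active in a superstep only if its distance improved in the
    previous superstep. Returns the distances and the number of active
    vertices in each superstep.
    """
    adj = defaultdict(list)
    for u, v, w in edges:
        adj[u].append((v, w))

    dist = {source: 0}
    active = {source}
    active_per_step = []

    while active:
        active_per_step.append(len(active))
        # Gather the best proposed distance per neighbour first, then
        # apply all updates at once (this models the synchronous barrier).
        proposals = {}
        for u in active:
            for v, w in adj[u]:
                cand = dist[u] + w
                if cand < proposals.get(v, float("inf")):
                    proposals[v] = cand
        active = set()
        for v, d in proposals.items():
            if d < dist.get(v, float("inf")):
                dist[v] = d
                active.add(v)

    return dist, active_per_step


# Hypothetical extreme case of a high-diameter graph: a chain of 1000
# vertices with unit edge weights.
n = 1000
chain = [(i, i + 1, 1) for i in range(n - 1)]
dist, active_per_step = sync_sssp(chain, 0)
print(len(active_per_step), max(active_per_step))
```

On this chain the run takes 1000 supersteps and never has more than one active vertex, so however the vertices are partitioned, almost every machine would be idle in almost every superstep. Real road networks have wider frontiers than a chain, and this sketch does not by itself explain why the same machine stays the busiest one.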

I tried both auto and random partitioning, and the very same machine always has 10% utilization, while all the others are almost idle. This machine is the first one in the ~/slaves file. I looked through the source code for any place where more work is assigned to the MPI process with rank=1 (the first in the ~/slaves file), with no luck.

Finally, please note that this machine also sends out a large amount of data over the network to all the other machines.

I would appreciate any explanation for this behavior.

Hi K, you are using a deprecated version of our code. We recommend switching to GraphLab Create: https://dato.com/products/create/docs/generated/graphlab.shortestpath.create.html#graphlab.shortestpath.create where we have an SSSP implementation that can scale to a graph of 100,000,000,000 edges on a single machine. We also have a distributed version: https://dato.com/products/distributed/features.html

Regarding your question, you may be seeing the graph finalization stage, where one machine coordinates the graph structure among all machines before the algorithm starts to run.

Thank you, Danny, for your quick and useful answer. I did not know GraphLab is now deprecated. I have a couple of follow-up comments, though:

1- In my case, this behavior lasts for a couple of hours, until the execution is done. During loading/finalize, the CPU utilization increases by about 8-10% on all workers, including the first one in the ~/slaves file. The difference of almost 10% always exists.

2- Is there a manual or documentation for upgrading my scripts from GraphLab to GraphLab Create?

3- Finally, I understand that GraphLab Create is out-of-core. Is it possible to always keep the data in memory?

Thanks, -K