About GraphLab Cluster Deployment

User 816 | 11/9/2014, 1:45:11 PM

Hi everyone, I am following the instructions at http://graphlab.org/projects/tutorials.html#cluster to deploy GraphLab on a cluster. When I execute mpirsync (~/graphlab/scripts/mpirsync), it fails with "mpiexec.openmpi: command not found". I have already installed OpenMPI-1.8.3 and I can run some MPI sample programs. I am confused about mpirsync because I have never seen mpiexec.openmpi under MPI's bin directory. Can anyone help me with this, please? Any help would be greatly appreciated!


User 6 | 11/9/2014, 3:08:24 PM

Try changing mpiexec.openmpi to mpiexec in the script and see if it works. By default, this link exists to differentiate between mpiexec.openmpi and mpiexec.mpich2.
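
To see which launcher names are actually on your PATH before editing the script, something like the following may help. This is only a sketch: pick_launcher is a hypothetical helper (not part of GraphLab or of any MPI package), and the candidate names are simply the ones mentioned in this thread.

```shell
# Print the first candidate command name that exists on PATH.
# pick_launcher is a hypothetical helper written for this sketch.
pick_launcher() {
    for candidate in "$@"; do
        if command -v "$candidate" >/dev/null 2>&1; then
            printf '%s\n' "$candidate"
            return 0
        fi
    done
    return 1
}

# Candidate names are taken from this thread; adjust them for your install.
launcher=$(pick_launcher mpiexec.openmpi mpiexec.mpich2 mpiexec) || launcher=""
echo "MPI launcher: ${launcher:-none found}"
```

Whatever name it prints is the one to use in place of mpiexec.openmpi inside mpirsync.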

User 816 | 11/10/2014, 1:04:36 AM

Hi DannyBickson, thanks very much for your quick response. I changed it to mpiexec and it works. I am now trying to run the connected component (CC) program on a cluster of 4 machines.

However, it runs 4 instances of the program, each computing CC individually instead of cooperating with each other. What is worse, changing ncpus has no effect on the running time, although it does when I run on a single multicore machine.

Here is my command (the LiveJournal dataset is divided into 4 files under the lj directory):

mpiexec -n 4 -hostfile ./myhosts --bynode ./connectedcomponent --engineopts="type=synchronous" --graph=/home/xing/graphsystems/graphlab-master/release/toolkits/graphanalytics/lj --format=adj --ncpus=1 --saveprefix=ljout/cc

Any help would be greatly appreciated!

User 6 | 11/10/2014, 4:56:23 AM

It seems something is wrong with your MPI setup - please follow step 2 here: http://graphlab.org/projects/tutorials.html#perf_tuning to verify MPI is working correctly.

User 816 | 11/10/2014, 9:36:21 AM

Hi DannyBickson,

Thanks for your advice. I tried rpc_example1 and got the messages attached. Briefly, I got "TCP Communication layer constructed Run with exactly 2 MPI nodes."

By the way, the command is mpiexec -n 2 -hostfile ./myhosts --bynode ~/xing/graphsystems/graphlab-master/release/demoapps/rpc/rpc_example1

User 6 | 11/11/2014, 9:36:09 AM

Your example does not show the desired output, so I believe you are running two separate MPI instances which cannot find each other. If you have both openmpi and mpich2 installed, I suggest switching between the two and finding which one works on your setup.
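
One way to tell which launcher is really starting your processes is to print the environment it sets up. The sketch below assumes that Open MPI's mpiexec exports OMPI_COMM_WORLD_RANK and OMPI_COMM_WORLD_SIZE and that MPICH's launcher exports PMI_RANK and PMI_SIZE; please check this against your installed versions. describe_launch_env is a hypothetical helper name.

```shell
# Report which launcher environment this process was started under.
# describe_launch_env is a hypothetical helper written for this sketch.
describe_launch_env() {
    if [ -n "${OMPI_COMM_WORLD_SIZE:-}" ]; then
        # Assumed to be set by Open MPI's launcher.
        echo "openmpi rank ${OMPI_COMM_WORLD_RANK:-?} of ${OMPI_COMM_WORLD_SIZE}"
    elif [ -n "${PMI_SIZE:-}" ]; then
        # Assumed to be set by MPICH's launcher.
        echo "mpich rank ${PMI_RANK:-?} of ${PMI_SIZE}"
    else
        echo "no MPI launcher environment detected"
    fi
}

describe_launch_env
```

If you save this as a script (say launch_env.sh, a made-up name) and start it with the same mpiexec and hostfile you use for rpc_example1, each process should report its own rank. You can then compare that with the MPI library the binary was linked against, for example with ldd on rpc_example1. A binary built against one implementation but started by the other's mpiexec typically comes up as independent single-process runs, which would match what you are seeing.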