GraphLab installation on Hadoop cluster

User 1429 | 3/28/2015, 7:51:05 PM

Sir, I am trying to install GraphLab on a cluster, but I am unable to do it. I followed these tutorials:
http://bickson.blogspot.in/2012/10/deploying-graphlab-cluster-using-mpi.html
https://github.com/graphlab-code/graphlab/blob/master/TUTORIALS.md

I am getting errors for the following commands:

~/graphlab/scripts/mpirsync

~/graphlab/scripts/mpirsync

Errors:

Open RTE detected a parse error in the hostfile:
/home/map/machines
It occured on line number 1 on token 5:
map

[ubuntu:01091] [[1552,0],0] ORTE_ERROR_LOG: Error in file base/ras_base_allocate.c at line 222
[ubuntu:01091] [[1552,0],0] ORTE_ERROR_LOG: Error in file base/plm_base_launch_support.c at line 99
[ubuntu:01091] [[1552,0],0] ORTE_ERROR_LOG: Error in file plm_rsh_module.c at line 1173


Open RTE detected a parse error in the hostfile:
/home/map/machines
It occured on line number 1 on token 5:
map

[ubuntu:01092] [[1559,0],0] ORTE_ERROR_LOG: Error in file base/ras_base_allocate.c at line 222
[ubuntu:01092] [[1559,0],0] ORTE_ERROR_LOG: Error in file base/plm_base_launch_support.c at line 99
[ubuntu:01092] [[1559,0],0] ORTE_ERROR_LOG: Error in file plm_rsh_module.c at line 1173

Please suggest good resources for doing this.

Comments

User 1592 | 3/28/2015, 7:55:57 PM

You should have a file named /home/map/machines that lists the DNS names of the machines.
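
For illustration, here is a minimal sketch of such a file, assuming five nodes. The names node1 to node5 are placeholders, not anything taken from this thread, and the sketch writes to ./machines by default so that it does not overwrite a real /home/map/machines. Open MPI hostfiles take one host name or IP address per line.

```shell
# Sketch only: node1..node5 are hypothetical names; substitute your own hosts.
# HOSTFILE defaults to ./machines so running this does not overwrite a real file.
HOSTFILE="${HOSTFILE:-./machines}"

# One host per line, with nothing else on the line.
cat > "$HOSTFILE" <<'EOF'
node1
node2
node3
node4
node5
EOF

# Print the file for a quick visual check.
cat "$HOSTFILE"
```

To write the file at the path mentioned above, the same snippet could be run with HOSTFILE=/home/map/machines set beforehand.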


User 1429 | 3/28/2015, 8:37:27 PM

Sir, thank you for your reply. I created the /home/map/machine file and included all 5 node IP addresses in it. But now I am getting the following error:

A daemon (pid 1276) died unexpectedly with status 127 while attempting to launch so we are aborting.

There may be more information reported by the environment (see above).

This may be because the daemon was unable to find all the needed shared libraries on the remote node. You may set your LD_LIBRARY_PATH to have the location of the shared libraries on the remote nodes and this will automatically be forwarded to the remote nodes.




bash: orted: command not found

A daemon (pid 1281) died unexpectedly with status 127 while attempting to launch so we are aborting.

There may be more information reported by the environment (see above).

This may be because the daemon was unable to find all the needed shared libraries on the remote node. You may set your LD_LIBRARY_PATH to have the location of the shared libraries on the remote nodes and this will automatically be forwarded to the remote nodes.



mpiexec.openmpi noticed that the job aborted, but has no info as to the process that caused that situation.