Number of processes in a cluster

User 1032 | 12/7/2014, 6:52:15 AM


I'm running the graph_analytics applications on a cluster to test scalability, but when I run mpiexec with the -n parameter greater than 64, the application crashes with a fatal error in MPI_Init_thread:

Fatal error in MPI_Init_thread: Other MPI error, error stack: MPIR_Init_thread(413): Initialization failed (unknown)(): Other MPI error

The application works fine with anything up to and including 64 cores, but the moment I increase the argument past this, I get the fatal error. I am sure I'm allocating at least this many cores from the cluster. Is there any hint as to what could be going wrong here, or at least a way to get a better error message?


User 6 | 12/7/2014, 7:10:35 AM

-n is the number of MPI processes. --ncpus is the number of cores to use on each machine.

It seems your MPI installation fails when it tries to initialize more than 64 MPI processes.
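
To make the split between the two flags concrete, here is a rough Python sketch that builds a launch command using one MPI process per machine, with the per-machine core count passed through --ncpus. The helper name launch_plan, the program name ./pagerank, and the 16 x 8 cluster shape are all made up for illustration, and whether keeping -n at or below 64 avoids the crash on your cluster is an assumption, not something verified here.

```python
import shlex


def launch_plan(machines, cores_per_machine, program, extra_args=()):
    """Build an mpiexec argv with one MPI process per machine.

    Returns (argv, total_threads). The layout (one process per machine,
    threads set via --ncpus) is an assumption for illustration only.
    """
    if machines < 1 or cores_per_machine < 1:
        raise ValueError("machines and cores_per_machine must be positive")
    argv = [
        "mpiexec", "-n", str(machines),        # -n: number of MPI processes
        program,
        "--ncpus", str(cores_per_machine),     # --ncpus: cores per machine
        *extra_args,
    ]
    return argv, machines * cores_per_machine


if __name__ == "__main__":
    # Hypothetical cluster: 16 machines with 8 cores each.
    argv, total = launch_plan(16, 8, "./pagerank")
    print(shlex.join(argv))
    print("cores used in total:", total)
```

With this layout the job would use 128 cores while starting only 16 MPI processes, so scaling past 64 cores would not require -n above 64.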