How can I run PageRank on a file in the Hadoop file system?

User 926 | 11/16/2014, 1:16:16 PM

Hi, I am new to GraphLab. I have put a file into the Hadoop file system, and I want to run the graph analytics PageRank toolkit on it with 1 master and 2 slaves. How can I do this? Thanks!


User 6 | 11/16/2014, 4:04:10 PM

An example of using PowerGraph here:

However, we strongly recommend switching to GraphLab Create, which has a PageRank implementation that supports HDFS:

User 926 | 11/16/2014, 6:13:37 PM

Hi Danny, thank you. My friend also asked me to look at an example, but I don't understand this command:

mpiexec -n 2 -hostfile ~/machines /path/to/als --matrix /some/ns/folder/smallnetflix/ --maxiter=3 --ncpus=1 --minval=1 --maxval=5 --predictions=outfile

I have a master and 2 slaves, and the file is located in the Hadoop file system. Where should I put the HDFS path in the command above? Also, how can I tell whether the job is actually running on all 3 machines?

User 926 | 11/16/2014, 7:16:20 PM

Then I tried this command:

mpiexec -n 2 -hostfile ~/machines env CLASSPATH=~/hadoop/hadoop-core-1.2.1.jar ~/graphlab/release/toolkits/graph_analytics/pagerank --graph=hdfs:// --format=tsv --saveprefix=hdfs://

It shows the following error:

Exception in thread "main" java.lang.NoClassDefFoundError: org/apache/commons/logging/LogFactory
    at org.apache.hadoop.conf.Configuration.<clinit>(
Caused by: java.lang.ClassNotFoundException: org.apache.commons.logging.LogFactory
    at$
    at$
    at Method)
    at
    at java.lang.ClassLoader.loadClass(
    at sun.misc.Launcher$AppClassLoader.loadClass(
    at java.lang.ClassLoader.loadClass(
    ... 1 more
Can't construct instance of class org.apache.hadoop.conf.Configuration
ERROR: hdfs.hpp(hdfs:111): Check failed: filesystem != __null

I have checked that the HDFS path is correct, so I don't know why this fails.
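
The NoClassDefFoundError for org.apache.commons.logging.LogFactory suggests that the JVM started by the HDFS client cannot find the commons-logging jar. The CLASSPATH in the command above lists only hadoop-core-1.2.1.jar. A minimal sketch of one possible fix, assuming Hadoop is unpacked in ~/hadoop and that its dependency jars sit in ~/hadoop/lib (the usual layout of a Hadoop 1.x tarball, but worth checking on your machines), is to put every jar from both directories on the classpath. The helper name build_classpath is hypothetical:

```shell
# Print a colon-separated list of every .jar found in the given directories.
build_classpath() {
  cp=""
  for dir in "$@"; do
    for jar in "$dir"/*.jar; do
      # Skip the unexpanded pattern when a directory has no jars.
      [ -e "$jar" ] || continue
      cp="${cp:+$cp:}$jar"
    done
  done
  printf '%s\n' "$cp"
}

# Assumption: Hadoop lives in ~/hadoop unless HADOOP_HOME says otherwise.
HADOOP_HOME="${HADOOP_HOME:-$HOME/hadoop}"
CLASSPATH="$(build_classpath "$HADOOP_HOME" "$HADOOP_HOME/lib")"
export CLASSPATH

# Inspect the result before handing it to mpiexec.
echo "$CLASSPATH"
```

The resulting value could then replace the single jar in the original command, for example env CLASSPATH="$CLASSPATH" in front of the pagerank binary. Explicit jar paths are used rather than a lib/* wildcard because a JVM started from native code may not expand classpath wildcards. The same jars would need to exist at the same paths on every machine in the hostfile.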