Load From HDFS problem

User 568 | 11/5/2014, 12:15:33 AM

When I tried to load graph data from HDFS, the log printed the following error:

FATAL: distributedgraph.hpp(loadfrom_hdfs:2226): Attempting to load a graph from HDFS but GraphLab was built without HDFS.

However, libhdfs was built, as shown in the config log:

-- Found Java: /usr/java/jdk1.7.067-cloudera/bin/java (found version "1.7.0.67")
-- Found JNI: /usr/java/jdk1.7.067-cloudera/jre/lib/amd64/libjawt.so
-- jni.h was found at /usr/java/jdk1.7.067-cloudera/include/jni.h
-- Java home set by user: /usr/java/jdk1.7.067-cloudera
-- Could NOT find ANT (missing: ANT_EXEC)
-- Building libhdfs

Comments

User 6 | 11/5/2014, 7:38:46 AM

Judging by the PowerGraph error message, it seems the HDFS setup did not complete successfully. We need more details to be able to debug.

We recommend switching to our newer product, GraphLab Create, which does support HDFS. The open-source PowerGraph code is going to be deprecated soon.


User 568 | 11/5/2014, 1:10:46 PM

I am running some benchmark experiments and have already finished half of them, so I want to stick with the current version.

What information do you need? Can you give me the name of the log file?


User 568 | 11/5/2014, 3:43:53 PM

Besides, should I specify the Hadoop path on the command line?


User 6 | 11/7/2014, 6:59:43 AM

Please see here: http://forum.graphlab.com/messages/31#64 This may also be related to the hadoop command not being accessible in the PATH. You should pass the Hadoop classpath on the command line, as explained in the post above.
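For example, the classpath can be passed to each MPI process roughly like this. This is a sketch, not the thread author's exact setup: the `./mpagerank` binary and the HDFS path are taken from later in this thread, and the namenode host/port are hypothetical placeholders.

```shell
# Sketch, assuming the `hadoop` CLI is installed and on the PATH.
# `hadoop classpath` prints the jar/config directories Hadoop uses,
# which is what libhdfs needs at runtime.
export HADOOP_CP=$(hadoop classpath)

# -genv exports the variable to every spawned MPI process.
# Binary name, namenode host/port, and HDFS path are illustrative.
mpiexec -n 1 -genv CLASSPATH "$HADOOP_CP" \
  ./mpagerank --graph hdfs://namenode.example.com:8020/user/jing/orkutinput
```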


User 568 | 11/7/2014, 3:49:28 PM

I have added the Hadoop path:

mpiexec -n 1 -genv CLASSPATH $hadooppath ./mpagerank --graph hdfs://user/jing/orkutinput (I have manually set $hadooppath)

It still gives the following error:

Using default values
Subnet ID: 0.0.0.0
Subnet Mask: 0.0.0.0
Will find first IPv4 non-loopback address matching the subnet
INFO: dc.cpp(init:573): Cluster of 1 instances created.
INFO: distributedgraph.hpp(setingressmethod:3201): Automatically determine ingress method: grid
2014-11-07 09:43:19,237 WARN [main] util.NativeCodeLoader (NativeCodeLoader.java:<clinit>(62)) - Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Exception in thread "main" java.lang.IllegalArgumentException: Wrong FS: hdfs://user/jing/orkutinput, expected: file:///
    at org.apache.hadoop.fs.FileSystem.checkPath(FileSystem.java:645)
    at org.apache.hadoop.fs.RawLocalFileSystem.pathToFile(RawLocalFileSystem.java:80)
    at org.apache.hadoop.fs.RawLocalFileSystem.listStatus(RawLocalFileSystem.java:367)
    at org.apache.hadoop.fs.FileSystem.listStatus(FileSystem.java:1485)
    at org.apache.hadoop.fs.FileSystem.listStatus(FileSystem.java:1525)
    at org.apache.hadoop.fs.ChecksumFileSystem.listStatus(ChecksumFileSystem.java:568)
Call to org.apache.hadoop.fs.FileSystem::listStatus failed!
WARNING: distributedgraph.hpp(loadfromhdfs:2235): No files found matching hdfs://user/jing/orkutinput
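Two things in this trace are worth noting, hedged as a reading of the log rather than a confirmed diagnosis. First, "expected: file:///" suggests Hadoop's configuration (core-site.xml, which sets fs.defaultFS) was not found on the classpath, so it fell back to the local filesystem. Second, in a URI like hdfs://user/jing/orkutinput, the component right after "hdfs://" is the authority (the namenode host:port), not part of the file path. The sketch below mimics that split with plain shell string operations:

```shell
# Minimal sketch of how a URI parser splits the path from the
# command above: the segment after "hdfs://" is the authority
# (namenode), so "user" is read as a hostname, not a directory.
uri="hdfs://user/jing/orkutinput"
rest="${uri#hdfs://}"   # strip the scheme  -> user/jing/orkutinput
host="${rest%%/*}"      # authority portion -> user
path="/${rest#*/}"      # remaining path    -> /jing/orkutinput
echo "namenode=$host path=$path"
```

So the URI would typically need either an explicit authority (hdfs://\<namenode-host\>:\<port\>/user/jing/orkutinput, with host and port taken from your cluster's core-site.xml) or a triple slash (hdfs:///user/jing/orkutinput) to defer to fs.defaultFS; both forms are standard Hadoop URI conventions, not something confirmed against this user's cluster.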


User 6 | 11/7/2014, 3:58:34 PM

Take a look here: https://groups.google.com/forum/#!msg/graphlab-kdd/w6mieYS63DY/VeqU4SoCQoQJ It may be that some jars are missing from your CLASSPATH.


User 568 | 11/7/2014, 4:43:17 PM

I have added all of these and even other extension libraries. :(


User 568 | 11/7/2014, 4:43:41 PM

If a jar were missing, the error should be ClassNotFoundException, right?


User 6 | 11/7/2014, 4:48:25 PM

We have already encountered a few cases where users reported this error and the issue was missing jars. Again, I recommend switching to GraphLab Create, where it will be easier for us to help you.