GraphLab Server Error?

User 512 | 12/12/2014, 4:19:25 PM

I am importing a large CSV file into GraphLab, and it keeps failing with a server error after reading some lines:

	PROGRESS: Finished parsing file text.csv
	PROGRESS: Parsing completed. Parsed 100 lines in 1.05467 secs.
	Unable to infer column types. Defaulting to str
	Could not detect types. Using str for each column.
	PROGRESS: Read 93629 lines. Lines per second: 34651.6
	PROGRESS: Read 306381 lines. Lines per second: 31554.2
	Unable to reach server for 3 consecutive pings. Server is considered dead. Please exit and restart.
	Traceback (most recent call last):
	  File "<stdin>", line 2, in <module>
	  File "/usr/local/lib/python2.7/site-packages/graphlab/data_structures/sframe.py", line 994, in read_csv
	    store_errors=False)[0]
	  File "/usr/local/lib/python2.7/site-packages/graphlab/data_structures/sframe.py", line 599, in _read_csv_impl
	    errors = proxy.load_from_csvs(internal_url, parsing_config, type_hints)
	  File "/usr/local/lib/python2.7/site-packages/graphlab/cython/context.py", line 23, in __exit__
	    raise exc_type(exc_value)

Is there any way to fix this?

Thanks!

Comments

User 1037 | 12/12/2014, 6:42:37 PM

Hi @Shuning

Could you send us the CSV file that causes the server error? We will look into it. If you don't want to share it on the forum, feel free to email it to me: jgu@graphlab.com

Thanks, Jay


User 512 | 12/12/2014, 9:56:25 PM

It is a pretty big file, about 300MB ...


User 14 | 12/12/2014, 10:14:06 PM

Looking at your log, the error happens after line 306381. Can you check if it can be reproduced for the first 500K lines of the csv, and send us the minimal data which causes the problem?
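In case it helps with trimming the file, here is a minimal sketch for copying the first N lines of a CSV into a smaller file. The function name `write_head` and the output file name are made up for illustration, and it counts physical lines, so it assumes no quoted field contains an embedded newline. It copies bytes unchanged, so any odd characters in the original are preserved.

```python
import itertools


def write_head(src_path, dst_path, max_lines=500000):
    """Copy at most max_lines physical lines from src_path to dst_path.

    Files are opened in binary mode so the bytes are copied unchanged.
    Returns the number of lines actually copied.
    """
    copied = 0
    with open(src_path, "rb") as src, open(dst_path, "wb") as dst:
        # islice stops after max_lines without reading the rest of the file.
        for line in itertools.islice(src, max_lines):
            dst.write(line)
            copied += 1
    return copied


# Example call (text_head.csv is a hypothetical output name):
# write_head("text.csv", "text_head.csv", max_lines=500000)
```

If the smaller file still triggers the error, halving `max_lines` repeatedly should narrow down the region that causes it.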

Thanks!


User 512 | 12/17/2014, 4:33:27 PM

Hmm, it seems this error is related to the Linux distribution. The server error keeps showing up on Red Hat Enterprise Linux, but if I load the same CSV file on an Ubuntu desktop, there is no error at all. Not sure why ...


User 1037 | 12/17/2014, 6:51:52 PM

Interesting! We are really interested in finding the root cause of this issue. Would you mind sending us an input CSV that reproduces the error? If the input file is too big, we can give you an S3 bucket for uploading. Thanks!


User 512 | 12/17/2014, 9:11:10 PM

Sure, the file is about 110MB, and the error happened on Red Hat Enterprise Linux Server 6.4. Please let me know how to upload this big file.


User 512 | 12/17/2014, 9:15:06 PM

On my Ubuntu VM, the import finished in about 5 seconds without any server error:

PROGRESS: Parsing completed. Parsed 539283 lines in 5.49077 secs


User 512 | 12/30/2014, 11:01:27 PM

I tried it again under virtualenv, and it is still not working. The disk is not full. Maybe there is some compatibility issue with Red Hat Linux?


User 940 | 12/31/2014, 6:45:11 PM

Hi Shuning,

Could you upload the file to something like Google Drive and e-mail me the file at piotr at graphlab.com?

Thank you for your patience! -Piotr


User 14 | 1/2/2015, 3:15:12 AM

Piotr, Shuning has sent me the file; however, I am unable to reproduce the problem. If you are interested in taking a look, I will send you a pointer.

Jay


User 512 | 1/8/2015, 10:06:18 PM

This is getting more interesting. I imported the original CSV file into Hive, then exported the "table" from Hive to another CSV file, and GraphLab has no problem reading the newly generated file. The two CSV files should be the same, so I still don't know why the original file is not working ...
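One way to check whether the two files really hold the same data is to compare them record by record. A rough sketch using only Python's standard `csv` module; the function name `first_csv_difference` is made up here, and it assumes both files are UTF-8 and use the default comma delimiter, which may not match the Hive export settings.

```python
import csv
import itertools


def first_csv_difference(path_a, path_b, encoding="utf-8"):
    """Return (record_number, row_a, row_b) for the first differing record.

    Returns None if both files parse to identical records. If one file is
    shorter, the missing record is reported as None.
    """
    with open(path_a, newline="", encoding=encoding) as fa, \
         open(path_b, newline="", encoding=encoding) as fb:
        # zip_longest pads the shorter file with None so a length
        # mismatch is reported as a difference too.
        pairs = itertools.zip_longest(csv.reader(fa), csv.reader(fb))
        for number, (row_a, row_b) in enumerate(pairs, start=1):
            if row_a != row_b:
                return number, row_a, row_b
    return None
```

This compares parsed fields, so differences in quoting or line endings alone do not count as a mismatch; a byte-level diff would be needed to spot those.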


User 512 | 1/8/2015, 10:19:14 PM

Well, I have to take the above comment back. I tried again, and now neither CSV file works: GraphLab shows the same error for both. The problem seems to happen most of the time; only occasionally, if I am lucky, does the import succeed.