SFrame.save() communication error

User 2075 | 7/7/2015, 4:07:57 AM

I have an SFrame object with about 40,000 rows and 5 columns, and I am trying to save it to disk by calling SFrame.save('dirtodisk'). When the SFrame is small, this works perfectly. However, as I append new rows and the SFrame gradually grows, the save function raises an error.

Specifically, the function is aggtraintestallusers(range(1,50), inputfolddir, outputfolddir, timewindowrange=1). The parameter "range(1,50)" denotes the user ids whose information (each also an SFrame) I want to append. I tried range(1,30) and range(20,50), and both work well, which means the user info SFrames themselves have no problem. However, when I try range(1,50), GraphLab crashes and generates "Communication Failure: 65.". The detailed error report is below:

RuntimeError                              Traceback (most recent call last)
<ipython-input-3-ebd95796c70d> in <module>()
      1 inputfolddir = './data/userssplit/'
      2 outputfolddir = './data/usersagg/'
----> 3 aggtraintestallusers(range(10,50), inputfolddir, outputfolddir, timewindowrange=1)

<ipython-input-2-269f95ee728b> in aggtraintestallusers(userrange, inputfolddir, outputfolddir, timewindowrange)
     27 aggtest = aggtest.append(sftest)
     28
---> 29 agghist.save(outputfolddir+str(itr)+'/histagg'), aggtrain.save(outputfolddir+str(itr)+'/trainagg'), aggtest.save(outputfolddir+str(itr)+'/testagg')

/Library/Python/2.7/site-packages/graphlab/datastructures/sframe.pyc in save(self, filename, format)
   2842 self.proxy.saveascsv(url, {})
   2843 else:
-> 2844 raise ValueError("Unsupported format: {}".format(format))
   2845
   2846 def selectcolumn(self, key):

/Library/Python/2.7/site-packages/graphlab/cython/context.pyc in exit(self, exctype, excvalue, traceback)
     29 def exit(self, exctype, excvalue, traceback):
     30 if not self.showcythontrace and exctype:
---> 31 raise exctype(exc_value)

RuntimeError: Communication Failure: 65.

Comments

User 1178 | 7/7/2015, 4:03:10 PM

Hi Eric,

Can you try to insert a new line after agg_test.append() like the following and let me know your result?

`
agg_test = agg_test.append(sf_test)

# this is the new line of code to be inserted
agg_test.materialize()

# continue your code...
`

What I am trying to isolate is a possible stack overflow when there are too many appends, since internally we keep a query tree that grows with each append.
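To illustrate the concern, here is a toy model only. It is not GraphLab's actual implementation, and `LazyFrame`, `aggregate`, and `materialize_every` are hypothetical names made up for this sketch. It assumes that each append merely records a node in a pending tree, and that materializing evaluates the tree and replaces it with concrete rows:

```python
class LazyFrame(object):
    """Toy stand-in for a lazily evaluated frame (hypothetical, not GraphLab code)."""

    def __init__(self, rows=None, left=None, right=None):
        self._rows = rows    # concrete rows, or None while the append is pending
        self._left = left
        self._right = right
        # Height of the pending tree below this node (0 when rows are concrete).
        self.depth = 0 if rows is not None else 1 + max(left.depth, right.depth)

    def append(self, other):
        # No rows are copied here; only a new tree node is recorded.
        return LazyFrame(left=self, right=other)

    def rows(self):
        # Evaluation walks the tree recursively, so recursion depth grows with
        # the tree height. A very deep tree could exhaust the stack.
        if self._rows is not None:
            return list(self._rows)
        return self._left.rows() + self._right.rows()

    def materialize(self):
        # Evaluate once, then drop the pending tree.
        self._rows = self.rows()
        self._left = self._right = None
        self.depth = 0


def aggregate(parts, materialize_every=None):
    """Append all parts; optionally materialize after every N appends."""
    agg = parts[0]
    max_depth = 0
    for i, part in enumerate(parts[1:], start=1):
        agg = agg.append(part)
        if materialize_every and i % materialize_every == 0:
            agg.materialize()
        max_depth = max(max_depth, agg.depth)
    return agg, max_depth


parts = [LazyFrame(rows=[i]) for i in range(50)]
lazy, lazy_depth = aggregate(parts)                         # tree keeps growing
eager, eager_depth = aggregate(parts, materialize_every=10) # tree is reset periodically
```

In this toy model both runs hold the same rows, but the pending tree stays shallow when it is materialized periodically, which is the effect the extra line above is meant to test for.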

In the meantime, can you also send me your server log? The location of the server log is printed at the beginning of your Python session when you do "import graphlab".

Thanks!

Ping


User 2356 | 10/3/2015, 10:08:18 AM

related @"Ping Wang"

http://forum.dato.com/discussion/1313/large-sframe-appending-error#latest


User 940 | 10/6/2015, 5:59:31 PM

@abby,

Saw that, thanks! I posted an answer on your previous thread.

Cheers! -Piotr


User 3757 | 3/18/2016, 5:12:42 PM

That worked for me. Thanks!