Upgraded Recently to Version 1.3 and need equivalent configuration setting

User 464 | 2/25/2015, 12:59:56 AM

Hi, I just upgraded to version 1.3 from 1.1 and had a memory feature set in my program:

gl.set_runtime_config('GRAPHLAB_SFRAME_SORT_BUFFER_NUM_CELLS', 10000000)

Which now returns an error. Commenting this line out results in the original 'Communication Failure: Unable to reach server for 3 consecutive pings.'

Is there a similar config setting in 1.3?

Comments

User 1190 | 2/25/2015, 8:06:44 AM

Can you try tweaking the sort buffer size?

 The maximum estimated memory consumption sort is allowed to use. Increasing
 this will increase the size of each sort partition, and will increase          
 performance with increased memory consumption. Defaults to 2GB.                

If you are running out of memory, you can try lowering the value to 1GB:

gl.set_runtime_config('GRAPHLAB_SFRAME_SORT_BUFFER_SIZE', 1*1024*1024*1024)
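
Since the setting is a size in bytes, it may help to compute the value explicitly as a whole number. Below is a minimal sketch: `gib_to_bytes` is a hypothetical helper, not part of GraphLab, and the commented-out call assumes the `GRAPHLAB_SFRAME_SORT_BUFFER_SIZE` key discussed in this thread. Whether the setting accepts a fractional value is not something confirmed here, so the helper rounds down to an integer.

```python
def gib_to_bytes(gib):
    """Convert a size in GiB (possibly fractional) to a whole number of bytes."""
    return int(gib * 1024 * 1024 * 1024)


# Candidate buffer sizes, from the suggested 1GB down to smaller values.
one_gib = gib_to_bytes(1)
half_gib = gib_to_bytes(0.5)
tenth_gib = gib_to_bytes(0.1)

# Assumed usage (requires GraphLab Create to be installed):
# import graphlab as gl
# gl.set_runtime_config('GRAPHLAB_SFRAME_SORT_BUFFER_SIZE', one_gib)
```

Here `one_gib` is 1073741824 bytes and `half_gib` is 536870912 bytes.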


User 464 | 2/26/2015, 4:24:09 PM

Thanks for the quick response, Jay. I've tried several different settings:

gl.set_runtime_config('GRAPHLAB_SFRAME_SORT_BUFFER_SIZE', 1*1024*1024*1024)
gl.set_runtime_config('GRAPHLAB_SFRAME_SORT_BUFFER_SIZE', 0.5*1024*1024*1024)
gl.set_runtime_config('GRAPHLAB_SFRAME_SORT_BUFFER_SIZE', 0.1*1024*1024*1024)

My machine has 16GB of RAM, so I didn't keep reducing the size. Every time I reduced the sort buffer size, it took longer to use up all the RAM, but it eventually did and then gave the Communication Failure error.

Since the work was slightly time-sensitive, I uninstalled GL version 1.3 and reinstalled version 1.1 and things are working again.

I don't think I can put together a reproducible example, since the SFrame being sorted is quite large, but I can give you some ideas for recreating the error on your own. I'm sorting a 2.5GB file of addresses by Zip Code. I know that the sort is lazy, so after the sort, the code calls 'SFrame.add_row_number()', which forces the sort and adds an index. This is where it hangs.

Let me know if you need more information!


User 15 | 2/26/2015, 6:44:48 PM

Hey Nick,

Do you happen to have the server log from the 1.3 run that produces the communication failure? That would help me see what's going on.

As for memory problems in sort, I can't really reproduce them unless I have an idea of the distribution of the sort keys.
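
One way to describe that distribution without sharing the data itself is a short frequency summary of the sort column. The following is a minimal sketch in plain Python, assuming the keys (Zip Codes here) can be iterated as strings; `summarize_keys` is a hypothetical helper, not a GraphLab function, and the sample values are made up for illustration.

```python
from collections import Counter


def summarize_keys(keys, top_n=5):
    """Summarize how skewed a collection of sort keys is."""
    counts = Counter(keys)
    total = sum(counts.values())
    top = counts.most_common(top_n)
    # Fraction of all rows covered by the top_n most frequent keys.
    top_share = sum(c for _, c in top) / float(total) if total else 0.0
    return {
        'total_rows': total,
        'distinct_keys': len(counts),
        'top_keys': top,
        'top_share': top_share,
    }


# Made-up sample keys; in practice these would come from the sort column.
sample = ['98101', '98101', '98101', '10001', '60601']
summary = summarize_keys(sample, top_n=1)
```

A high `top_share` with few distinct keys would indicate heavily repeated sort keys, which is the kind of detail that helps with reproducing the problem.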

Thanks,

Evan