In-memory processing

User 11 | 12/5/2015, 12:18:00 AM

I understand that GraphLab Create supports out-of-core graph processing. Can I configure it to process everything in memory instead? If so, which configuration parameters control this?

Thanks, -K

Comments

User 954 | 12/5/2015, 12:56:44 AM

Hi,

Please look at the help for graphlab.set_runtime_config (type graphlab.set_runtime_config? in IPython), especially these two parameters: GRAPHLAB_FILEIO_MAXIMUM_CACHE_CAPACITY_PER_FILE and GRAPHLAB_FILEIO_MAXIMUM_CACHE_CAPACITY.

If you set those configuration parameters to a large value, SFrames and SArrays (and SGraphs) will be flushed to disk less frequently, which should result in better performance.
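
A minimal sketch of what that could look like. The 32 GiB and 16 GiB figures are placeholders to adjust for your machine, the values are assumed to be given in bytes, and the helper names `cache_settings` and `apply_settings` are made up for this example:

```python
GIB = 1024 ** 3  # bytes per GiB


def cache_settings(total_gib, per_file_gib):
    """Build the two cache-capacity settings, expressed in bytes (assumed unit)."""
    return {
        "GRAPHLAB_FILEIO_MAXIMUM_CACHE_CAPACITY": total_gib * GIB,
        "GRAPHLAB_FILEIO_MAXIMUM_CACHE_CAPACITY_PER_FILE": per_file_gib * GIB,
    }


def apply_settings(settings):
    """Apply the settings if GraphLab Create is installed; return True if applied."""
    try:
        import graphlab
    except ImportError:
        return False
    for name, value in settings.items():
        graphlab.set_runtime_config(name, value)
    return True


# Placeholder sizes: pick values that fit comfortably in your RAM.
settings = cache_settings(total_gib=32, per_file_gib=16)
applied = apply_settings(settings)
```

Set these before loading any data, so that the larger limits are in effect when the first SFrame or SGraph is created.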


User 11 | 12/10/2015, 7:14:44 PM

Thank you for your reply. Is there any way to force GraphLab to raise an error if the memory is not sufficient?

I do not want to flush to disk less frequently; I want to prevent it entirely.

Thanks again, -Khaled


User 1207 | 12/10/2015, 10:51:08 PM

Hello Khaled,

All the data structures are backed by disk, but if the cache size is very large, nothing should ever get written out. The downside is that if you do run out of memory, some operations may fail with an out-of-memory error, and depending on where in the code this happens, it may cause unexpected problems.

What OS are you using? If you are using Linux, another option is to create a tmpfs mount (a directory that maps to RAM) and then set GRAPHLAB_CACHE_FILE_LOCATIONS to that directory.
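
A rough sketch of the tmpfs approach, assuming a Linux system where /dev/shm is already a tmpfs mount (the default on Ubuntu). The helper name `make_ram_cache_dir` and the `graphlab_cache_` prefix are invented for this example. A dedicated mount can instead be created as root with something like `mount -t tmpfs -o size=16g tmpfs /mnt/graphlab_cache`, where the size and path are placeholders.

```python
import os
import tempfile


def make_ram_cache_dir(base="/dev/shm", prefix="graphlab_cache_"):
    """Create a fresh cache directory under a tmpfs mount.

    Falls back to the system temp directory when `base` is missing or not
    writable; that fallback is probably NOT RAM-backed.
    """
    if not (os.path.isdir(base) and os.access(base, os.W_OK)):
        base = tempfile.gettempdir()
    return tempfile.mkdtemp(prefix=prefix, dir=base)


cache_dir = make_ram_cache_dir()

try:
    import graphlab
    # Point GraphLab Create's intermediate cache files at the RAM-backed directory.
    graphlab.set_runtime_config("GRAPHLAB_CACHE_FILE_LOCATIONS", cache_dir)
except ImportError:
    # GraphLab Create is not installed here; nothing to configure.
    pass
```

Keep in mind that files in a tmpfs count against RAM (and swap), so once the mount fills up, writes fail rather than spilling to disk.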

-- Hoyt


User 11 | 12/10/2015, 10:54:51 PM

Thank you, Hoytak. I am using Ubuntu. A tmpfs image sounds like an interesting idea. Thanks!