Memory issues with triple_apply

User 1129 | 2/18/2016, 10:13:05 AM

From time to time, I experience a program crash with the following message: "RuntimeError: Runtime Exception. Runtime Exception: 113. Fail executing the lambda function. The lambda worker may have run out of memory or crashed because it captured objects that cannot be properly serialized." Most of the time, when I repeat the calculation on the same data, it completes without any problems. This indicates that the problem is with memory, not with the type of the stored data.

The code in question computes various degree types, as follows:

```python
def _degree_update(s, e, d):
    s['degree_out'] += 1
    d['degree_in'] += 1
    w = e['count']
    s['wdegree_out'] += w
    d['wdegree_in'] += w

    some_type = e['some_type']
    if some_type is None:
        s['type_specific_degree'] += 1  # only on src
    if some_type == 'A':
        s['type_specific_degree_A'] += 1
        s['type_specific_degree_B'] += 1
    return s, e, d

gg = gg.triple_apply(_degree_update, mutated_fields)
```


As you can see, the function operates on a large number of fields. In addition, because of this bug I have to pass all of the graph's fields to the function, so splitting `_degree_update` into smaller functions would not reduce the amount of data that triple_apply sends to it. Is there a way to prevent such errors?


User 1190 | 2/18/2016, 10:34:53 PM

Thanks for reporting the issue. It looks like there is a memory leak in sgraph.triple_apply. I have filed an issue in the GitHub repo to track it.

User 1592 | 2/19/2016, 5:59:34 AM

A fix was submitted by Jay - thanks Jay!

User 1190 | 2/19/2016, 7:25:21 PM

The fix has been merged. Please test it out and let me know if your issue is resolved.

User 1129 | 2/25/2016, 2:26:31 PM

I used pip to upgrade SFrame:


```python
In [8]: sframe.__version__
Out[8]: '1.8.3'
```

Is this the right version?

User 19 | 2/25/2016, 5:48:40 PM

Yes, that is the latest version.