MemoryError when reading an 8 GB JSON file

User 3589 | 3/15/2016, 12:34:47 PM

I am working with an 8 GB JSON file. When I tried to load it into an SFrame with .read_json, I got MemoryError: std::bad_alloc. I know that an SFrame is constrained by disk size rather than by memory, so I wonder whether it is the JSON reading step that is limited by memory, and what I can do about it. Is it possible to split the original JSON file into pieces and join them in an SFrame?

Sincerely, Louisa


User 15 | 3/15/2016, 9:27:05 PM

Hi @guanlu723,

Does your JSON file contain a single JSON object that is 8 GB, or is it meant to be one JSON object per line? If it's one object per line, you can supply the parameter orient='lines'. I think by default we assume it is one big object, which would be limited by memory, since it would occupy only a single row in an SFrame. If your file really is just one JSON object, then we are constrained by memory and you'd need a machine with more RAM.
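
If you're not sure which case applies, here is a rough way to check without loading the whole file. It is only a sketch using the Python standard library: looks_like_json_lines and its parameters are hypothetical names made up for this example, not part of SFrame, and it only samples the first few lines, so treat the answer as a hint rather than a guarantee.

```python
import json


def looks_like_json_lines(path, sample_lines=5, max_line_chars=1000000):
    """Guess whether a file holds one JSON object per line.

    Hypothetical helper, not part of SFrame. Only the first few
    non-empty lines are inspected, so memory use stays small.
    """
    checked = 0
    with open(path, encoding="utf-8") as f:
        while checked < sample_lines:
            # Bounded read: never pull more than max_line_chars at once.
            line = f.readline(max_line_chars)
            if not line:
                break  # end of file
            if len(line) == max_line_chars and not line.endswith("\n"):
                # One enormous line: probably a single big JSON value.
                return False
            line = line.strip()
            if not line:
                continue  # skip blank lines
            try:
                value = json.loads(line)
            except ValueError:
                # The line is not complete JSON on its own, e.g. the
                # first line of a pretty-printed object.
                return False
            if not isinstance(value, dict):
                return False  # e.g. a top-level array on one line
            checked += 1
    return checked > 0


# Example call (the file name is a placeholder):
# print(looks_like_json_lines("data.json"))
```

If this returns True, the file is probably line-delimited and orient='lines' is worth trying. If it returns False, the file is more likely one large object or array.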

Hope that helps!