Comparison of SFrames vs Pandas vs CSV vs R dataframes

User 2356 | 12/28/2015, 8:48:30 AM

Are there any published comparisons of SFrames, Pandas, CSV, and R dataframes on performance metrics such as memory consumption, timings, and behavior on very large datasets?

I am finding that SFrames are optimized for neither memory nor disk space, and I don't see any performance benefit over Pandas in spite of the claims made. Have any formal experiments already been done that would help evaluate all these options?
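
For anyone who wants to run this kind of comparison themselves, here is a minimal sketch of a measurement harness using only the Python standard library. The names `measure`, `write_sample_csv`, and `load_with_csv_module` are made up for illustration, and the synthetic file is an assumption, not data from this thread. The plain `csv` loader is only a baseline; a Pandas or SFrame loader could be passed to `measure` in the same way.

```python
import csv
import os
import tempfile
import time
import tracemalloc


def measure(fn, *args, **kwargs):
    """Run fn once; return (result, elapsed_seconds, peak_traced_bytes)."""
    tracemalloc.start()
    start = time.perf_counter()
    try:
        result = fn(*args, **kwargs)
        elapsed = time.perf_counter() - start
        # get_traced_memory() returns (current, peak) in bytes.
        _, peak = tracemalloc.get_traced_memory()
    finally:
        tracemalloc.stop()
    return result, elapsed, peak


def write_sample_csv(path, n_rows=10_000):
    """Write a small synthetic CSV with one numeric and one text column."""
    with open(path, "w", newline="") as f:
        writer = csv.writer(f)
        writer.writerow(["id", "label"])
        for i in range(n_rows):
            writer.writerow([i, f"cat_{i % 5}"])


def load_with_csv_module(path):
    """Baseline loader: read every row into a list of dicts."""
    with open(path, newline="") as f:
        return list(csv.DictReader(f))


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "sample.csv")
        write_sample_csv(path)
        rows, seconds, peak = measure(load_with_csv_module, path)
        disk_mb = os.path.getsize(path) / 1e6
        print(f"rows={len(rows)} time={seconds:.4f}s "
              f"peak={peak / 1e6:.2f} MB disk={disk_mb:.2f} MB")
```

One caveat: `tracemalloc` only sees allocations made through Python's memory allocator, so a library whose engine allocates natively may be under-reported. Process-level resident memory is likely a fairer number in that case.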

Comments

User 1592 | 12/28/2015, 7:19:43 PM

Hi, we have many performance results showing that GraphLab is much faster and can scale to much larger datasets. See, for example, https://twitter.com/boris_gorelik/status/558166375600902144 and also http://blog.dato.com/how-fast-are-out-of-core-algorithms


User 2356 | 1/7/2016, 12:58:09 PM

@DannyBickson Thanks. What if the columns hold text or categorical values? The comparison given is for numeric data. Also, if boosted decision trees are used, such data becomes difficult to handle.