Performance testing and materialisation of lazy evaluation

User 2568 | 5/8/2016, 10:47:39 PM

I'm working with a large data set (35m rows) and I'd like to performance test parts of my feature engineering code. However, lazy evaluation makes plain %timeit unreliable.

What is the best way to force materialisation of a lazily evaluated result? Is it to use len()?

Comments

User 2506 | 5/9/2016, 5:53:10 PM

You can do sf.materialize().

We are considering making this a real method soon.


User 2568 | 5/9/2016, 10:26:17 PM

I don't understand what 'making this a real method' means.


User 12 | 5/9/2016, 10:53:26 PM

Currently, materialize is hidden - it's actually sf.__materialize__(). I think Shaowei means that we're considering exposing it to users more clearly.
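
To illustrate the timing pattern, here is a minimal sketch. LazyColumn and timed are hypothetical stand-ins written for this example, not part of the SFrame API; with a real SFrame, the __materialize__() call inside the timed region would be made on the SFrame itself. The point is only that the forced evaluation has to happen inside the region being timed, otherwise you measure the cost of queuing the operations rather than running them.

```python
import time


class LazyColumn:
    """Hypothetical stand-in for a lazily evaluated column (NOT the SFrame API)."""

    def __init__(self, values, ops=()):
        self._values = values
        self._ops = tuple(ops)
        self._cache = None

    def apply(self, fn):
        # Only records the operation; no work is done here.
        return LazyColumn(self._values, self._ops + (fn,))

    def __materialize__(self):
        # Runs every queued operation once and caches the result.
        if self._cache is None:
            out = list(self._values)
            for fn in self._ops:
                out = [fn(v) for v in out]
            self._cache = out
        return self._cache


def timed(build):
    """Time build(), including forced materialisation of its result."""
    start = time.perf_counter()
    result = build()
    result.__materialize__()  # force evaluation inside the timed region
    return result, time.perf_counter() - start


col = LazyColumn(range(100_000))
feature, seconds = timed(lambda: col.apply(lambda v: v * v).apply(lambda v: v + 1))
print(f"materialised in {seconds:.4f}s")
```

In a notebook, the same idea would be to put both the feature code and the materialisation call in the statement passed to %timeit.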


User 2568 | 5/9/2016, 11:22:25 PM

Thanks, that's exactly what I'm after!