SFrame read_csv -- parse lines parameter

User 3330 | 3/28/2016, 8:15:06 PM

I was hoping that there could be a parameter in the read_csv method to change the number of lines that are used to guess the data types of the columns.

In my experience, the 100 lines used to guess the column data types are not enough to infer the types accurately. After switching this to 1,000 lines, however, the parser has guessed correctly 100% of the time. I have been changing the SFrame.py file to use 1,000 lines, and parsing takes 1.08463 seconds with 69 columns. However, I have to edit the file manually and then recompile the .pyc file, which seems onerous for what could be a simple parameter.
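
To illustrate why the sample size matters, here is a minimal sketch of sample-based type guessing using only the Python standard library. It is not SFrame's actual inference code; the names `infer_column_types`, `widen`, and `sample_lines` are hypothetical and exist only for this example.

```python
import csv
import io

# Candidate types ordered from narrowest to widest.
ORDER = (int, float, str)


def widen(current, value):
    """Return the narrowest type, no narrower than `current`, that parses `value`."""
    for candidate in ORDER[ORDER.index(current):]:
        if candidate is str:
            return str
        try:
            candidate(value)
            return candidate
        except ValueError:
            continue
    return str


def infer_column_types(text, sample_lines=100):
    """Guess int, float, or str per column from the first `sample_lines` data rows.

    Hypothetical helper for illustration; not part of SFrame.
    """
    reader = csv.reader(io.StringIO(text))
    header = next(reader)
    guesses = {name: int for name in header}  # start narrow, widen as needed
    for i, row in enumerate(reader):
        if i >= sample_lines:
            break
        for name, value in zip(header, row):
            if value == "":
                continue  # empty cells are treated as missing and skipped
            guesses[name] = widen(guesses[name], value)
    return guesses


data = "id,price\n1,10\n2,20\n3,3.5\n"
short_guess = infer_column_types(data, sample_lines=2)    # never sees the 3.5 row
long_guess = infer_column_types(data, sample_lines=1000)  # sees every row
```

With only two rows sampled, `price` looks like an integer column; with a larger sample the guess widens to float. Until a parameter like this exists, a possible workaround that avoids editing the library is to build such a dictionary yourself and pass it as `column_type_hints` to `read_csv`, assuming your installed version accepts a dict of column name to type there.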

Comments

User 2917 | 3/28/2016, 9:01:52 PM

Thank you, I have shared your feedback with the team. Please contact us if you have any further questions!