read_csv and NULL

User 2568 | 6/14/2016, 10:38:34 PM

I read a CSV file using

data = gl.SFrame.read_csv('Data/CustomersData.csv', nrows_to_infer=1000)

and got the message

Finished parsing file /home/ec2-user/Notebooks/Data/CustomersData.csv
Parsing completed. Parsed 1000 lines in 3.17911 secs.
------------------------------------------------------
Inferred types from first 1000 line(s) of file as 
column_type_hints=[str,str,str,str,int,str,str,str,str,str,int,str,str,str,str]
If parsing fails due to incorrect types, you can correct
the inferred type list above and pass it to read_csv in
the column_type_hints argument
------------------------------------------------------
Unable to parse line "11104311,NZ,CHRISTCHURCH,2014-04-17 00:00:00.000,NULL,NULL,Lapsed Member,Male,Auckland A,Oliver - Crisford,908512,2014-04-17 00:00:00.000,2015-05-31 00:00:00.000,Inactive,Supplementary"
Unable to parse line "21096170,ID,JAKARTA BARAT,2014-09-18 00:00:00.000,NULL,NULL,Lapsed Member,Male,Jakarta A,RENY - RUDYANINGSIH,908470,2014-09-18 00:00:00.000,2015-09-30 00:00:00.000,Inactive,Primary"

This error message is not very helpful: it does not say which column is the problem or why. After some guessing, I suspected the 5th column, which was inferred as an int but contains the string NULL in these lines. Once I figured out what the problem was, the solution was simple: add na_values=['NULL'] to the read_csv call.
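For anyone hitting the same message, here is a rough sketch of how the offending column could be tracked down by hand instead of by guessing. It is plain standard-library Python, not part of GraphLab Create: the helper name find_unparseable_fields and its arguments are made up for illustration, and it only mimics the idea of trying each inferred type against each field of a rejected line.

```python
import csv

def find_unparseable_fields(line, type_hints, na_values=()):
    """Hypothetical helper (not a GraphLab API): list the fields of one CSV
    line that cannot be converted to the type hinted for their column.

    Returns a list of (column_number, value, expected_type_name) tuples,
    with columns numbered from 1.
    """
    fields = next(csv.reader([line]))
    problems = []
    for position, (value, expected) in enumerate(zip(fields, type_hints), start=1):
        if value in na_values:
            continue  # treated as a missing value, so no conversion is attempted
        try:
            expected(value)
        except ValueError:
            problems.append((position, value, expected.__name__))
    return problems

# Type hints copied from the read_csv message above.
hints = [str, str, str, str, int, str, str, str, str, str, int, str, str, str, str]

# First rejected line from the output above.
bad_line = ("11104311,NZ,CHRISTCHURCH,2014-04-17 00:00:00.000,NULL,NULL,"
            "Lapsed Member,Male,Auckland A,Oliver - Crisford,908512,"
            "2014-04-17 00:00:00.000,2015-05-31 00:00:00.000,Inactive,Supplementary")

print(find_unparseable_fields(bad_line, hints))
print(find_unparseable_fields(bad_line, hints, na_values=["NULL"]))
```

The first call reports column 5 with the value NULL and the expected type int; the second reports nothing once NULL is treated as a missing value, which matches what na_values=['NULL'] did for me in read_csv.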

It would be helpful to have more specific error messages when reading CSV files, such as the column and the expected type that failed.

Comments

User 940 | 6/20/2016, 6:58:43 PM

Noted. Thanks!