SFrame.read_csv ignoring column_type_hints

User 2568 | 6/14/2016, 10:32:26 PM

I'm having a few problems reading a CSV file. I used

data = gl.SFrame.read_csv('Data/CustomersData.csv')

and got the error

Unable to parse line "X5537983,CN,SHANGHAI,2010-11-30 00:00:00.000,3,NULL,Cancelled,Male,Shanghai A,Jane Dong Sheng - Chen,293227,2012-12-24 00:00:00.000,2014-01-31 00:00:00.000,Inactive,Primary"
Unable to parse line "X5537983,CN,SHANGHAI,2010-11-30 00:00:00.000,3,NULL,Cancelled,Male,Shanghai A,Jane Dong Sheng - Chen,293228,2014-02-18 00:00:00.000,2015-02-28 00:00:00.000,Inactive,Primary"

The problem was that the first column ('PMID') was inferred to be an int, so I tried

data = gl.SFrame.read_csv('Data/CustomersData.csv', column_type_hints={'PMID':str})

I got the same error along with the message 'These column type hints were not used: PMID'. Why is read_csv ignoring the hints?

Also, the error message was not particularly helpful. A more useful message would indicate which column was the problem and why.

Comments

User 2568 | 6/14/2016, 10:43:10 PM

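Found the problem: the first column was actually called \xef\xbb\xbfPMID (the file starts with a UTF-8 byte order mark), so the hint for 'PMID' never matched a column name. However, the error message for the rows that could not be parsed could have been more helpful.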
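
For anyone hitting the same thing: the three bytes \xef\xbb\xbf are the UTF-8 byte order mark (codecs.BOM_UTF8 in Python), which some editors and export tools write at the start of a file. Below is a minimal sketch, using only the standard library, of how one might detect the mark and write a cleaned copy before loading. The helper names has_utf8_bom and strip_utf8_bom are made up for illustration and are not part of GraphLab Create.

```python
import codecs
import io


def has_utf8_bom(path):
    """Return True if the file at `path` begins with the UTF-8 byte order mark."""
    with open(path, 'rb') as f:
        return f.read(len(codecs.BOM_UTF8)) == codecs.BOM_UTF8


def strip_utf8_bom(src_path, dst_path):
    """Copy src_path to dst_path, dropping a leading UTF-8 BOM if present.

    Hypothetical helper for illustration; assumes the file is UTF-8 encoded.
    """
    # The 'utf-8-sig' codec decodes UTF-8 and discards a leading BOM.
    # newline='' keeps the original line endings untouched.
    with io.open(src_path, 'r', encoding='utf-8-sig', newline='') as src:
        with io.open(dst_path, 'w', encoding='utf-8', newline='') as dst:
            for line in src:
                dst.write(line)
```

If has_utf8_bom('Data/CustomersData.csv') returns True, running strip_utf8_bom on it and passing the cleaned copy to gl.SFrame.read_csv with column_type_hints={'PMID':str} should let the hint match the header. This assumes the byte order mark is the only thing wrong with the header.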


User 940 | 6/20/2016, 6:58:20 PM

Again, thanks for the comments. Will file feature requests.