ipython crashes using SFrame.read_csv

User 5308 | 6/20/2016, 4:06:12 PM

I'm evaluating GraphLab Create on Mac OS 10.11.5. When attempting to create an SFrame from an existing CSV file, the ipython session crashes:

songs = gl.SFrame.read_csv("http://s3.amazonaws.com/dato-datasets/millionsong/song_data.csv")

libc++abi.dylib: terminating with uncaught exception of type boost::archive::iterators::dataflow_exception: attempt to decode a value not in base64 char set
/Users/a1152802/anaconda/envs/dato-env/bin/python.app: line 3: 68003 Abort trap: 6

I've also tried several local CSV files, all of which cause the same exception.

Comments

User 940 | 6/20/2016, 5:45:54 PM

Hi @"Brennan Cleveland",

Thanks for pointing this out. We're currently trying to repro this, and we will keep you in the loop as we fix it.

Cheers! -Piotr


User 940 | 6/20/2016, 5:56:53 PM

Hi @"Brennan Cleveland",

Could you verify the Python version you are using, as well as the GLC build number? You can get the build number with gl.version_info.build_number.

Cheers! -Piotr


User 5308 | 6/20/2016, 6:31:20 PM

(dato-env)MACC02J81YNDKQ5:~ a1152802$ ipython
Python 2.7.11 |Anaconda 4.0.0 (x86_64)| (default, Dec 6 2015, 18:57:58)
Type "copyright", "credits" or "license" for more information.

IPython 4.1.2 -- An enhanced Interactive Python.
?         -> Introduction and overview of IPython's features.
%quickref -> Quick reference.
help      -> Python's own help system.
object?   -> Details about 'object', use 'object??' for extra details.

In [1]: import graphlab as gl

In [2]: gl.version_info.build_number
Out[2]: '995'


User 940 | 6/20/2016, 8:57:21 PM

@"Brennan Cleveland"

OK, so this is a bit of a weird error. Are you able to perform other operations, like construct an SFrame?

import graphlab
sf = graphlab.SFrame()

Additionally, could you remove the ~/.graphlab directory and re-set the license key using graphlab.product_key.set_product_key()? It could be a corrupt configuration file.
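A minimal sketch of the directory reset, which moves the directory aside instead of deleting it so the old configuration can still be inspected. The helper name `reset_config_dir` and the backup naming are hypothetical; only the ~/.graphlab path and graphlab.product_key.set_product_key() come from the suggestion above.

```python
import os
import shutil
import time


def reset_config_dir(config_dir):
    """Move config_dir to a timestamped backup.

    Returns the backup path, or None if config_dir does not exist.
    (Hypothetical helper, not part of GraphLab Create.)
    """
    config_dir = os.path.expanduser(config_dir)
    if not os.path.isdir(config_dir):
        return None  # nothing to reset
    backup = "{}.bak-{}".format(config_dir, int(time.time()))
    shutil.move(config_dir, backup)
    return backup


# Usage, with all Python sessions that use GraphLab Create closed:
#   reset_config_dir("~/.graphlab")
# Then, in a fresh session, re-enter the key (placeholder value shown):
#   import graphlab
#   graphlab.product_key.set_product_key("YOUR-PRODUCT-KEY")
```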

Cheers! -Piotr


User 5308 | 6/20/2016, 9:28:32 PM

Piotr,

Re-setting the license key did the trick!

In [5]: songs = gl.SFrame.read_csv("http://s3.amazonaws.com/dato-datasets/millionsong/song_data.csv")
Downloading http://s3.amazonaws.com/dato-datasets/millionsong/song_data.csv to /var/tmp/graphlab-a1152802/81641/cdde9087-71a7-4bf4-8795-693b7a42dd0c.csv

In [7]: songs.column_names()
Out[7]: ['song_id', 'title', 'release', 'artist_name', 'year']


User 940 | 6/20/2016, 9:54:19 PM

@"Brennan Cleveland",

Apparently operations on the .config file are non-atomic, so if multiple processes (i.e., multiple Python sessions) operate on it at the same time, the file can become corrupted.
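For illustration only, here is one common way to make a file update atomic: write to a temporary file in the same directory, then rename it over the target. This is a generic sketch and not how GraphLab Create handles its configuration; the function name `atomic_write` is hypothetical. It uses os.replace, which needs Python 3.3 or later (on Python 2.7, as used in this thread, os.rename gives the same behavior on POSIX systems).

```python
import os
import tempfile


def atomic_write(path, data):
    """Write the string `data` to `path` so readers never see a partial file.

    Hypothetical helper for illustration; not part of GraphLab Create.
    """
    directory = os.path.dirname(os.path.abspath(path))
    # The temp file must live on the same filesystem as the target,
    # otherwise the final rename is not atomic.
    fd, tmp_path = tempfile.mkstemp(dir=directory, prefix=".tmp-")
    try:
        with os.fdopen(fd, "w") as tmp:
            tmp.write(data)
            tmp.flush()
            os.fsync(tmp.fileno())  # push contents to disk before the rename
        os.replace(tmp_path, path)  # atomic swap of old file for new
    except BaseException:
        # Clean up the temp file if anything failed before the rename.
        if os.path.exists(tmp_path):
            os.remove(tmp_path)
        raise
```

This keeps a reader from ever seeing a half-written file, but it does not stop two writers from overwriting each other's changes; that would additionally need a lock.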

Anyways, thanks for your patience!

Cheers! -Piotr