Unable to create SFrame from pandas.DataFrame on 1.5.2

User 2032 | 7/28/2015, 5:22:46 PM

To reproduce:

import numpy
import graphlab
import graphlab.numpy
from itertools import repeat
from pandas import DataFrame

k = 10 ** 6  # total number of rows
p = 10 ** 2  # number of times the block of 'a' values is repeated
# 'a' is the sequence 0..k/p-1 repeated p times, so every column has k entries
df = DataFrame({'id': list(xrange(0, k)), 'a': sum(repeat(list(xrange(0, k // p)), p), []), 'b': list(repeat(1, k))})

sa = graphlab.SFrame(df)

You will most likely get something like this:

---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-21-fd34e49968aa> in <module>()
----> 1 sa = graphlab.SFrame(df)
      2 n = graphlab.numpy.array(sa)

/home/production/.virtualenvs/tailor_core/lib/python2.7/site-packages/graphlab/data_structures/sframe.pyc in __init__(self, data, format, _proxy)
    846                     pass
    847                 else:
--> 848                     raise ValueError('Unknown input type: ' + format)
    849 
    850         sframe_size = -1

/home/production/.virtualenvs/tailor_core/lib/python2.7/site-packages/graphlab/cython/context.pyc in __exit__(self, exc_type, exc_value, traceback)
     47             if not self.show_cython_trace:
     48                 # To hide cython trace, we re-raise from here
---> 49                 raise exc_type(exc_value)
     50             else:
     51                 # To show the full trace, we do nothing and let exception propagate

TypeError: Unexpected data source. Possible data source types are: list, numpy.ndarray, pandas.Series, and string(url)

Oddly enough, the error does not list pandas.DataFrame as a possible data source, contrary to what the docs say: https://dato.com/products/create/docs/generated/graphlab.SFrame.html

Comments

User 2032 | 7/28/2015, 5:35:23 PM

The cause was that the kernel was not restarted after pandas was installed. The import worked in the notebook, but gl had not re-imported its dependencies and hence failed the HAS_PANDAS requirement. However, the reported error is somewhat misleading.
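
For anyone hitting the same thing, here is a minimal sketch of why a flag computed at import time goes stale. This is not GraphLab's actual code: probe_optional, HAS_DEP and the package name are hypothetical stand-ins for the HAS_PANDAS check, and the "installation" is only simulated by registering an empty module.

```python
import importlib
import sys
import types


def probe_optional(module_name):
    """Return True if module_name can be imported right now (hypothetical helper)."""
    try:
        importlib.import_module(module_name)
        return True
    except ImportError:
        return False


DEP = "hypothetical_optional_dep"  # stand-in name, not a real package

# A library typically runs this check once, when it is first imported.
HAS_DEP = probe_optional(DEP)

# Simulate installing the dependency afterwards, in the same running kernel.
sys.modules[DEP] = types.ModuleType(DEP)

# The cached flag still reflects the state at import time...
stale_flag = HAS_DEP
# ...while a fresh check (what a kernel restart effectively gives you) sees it.
fresh_flag = probe_optional(DEP)

print("cached flag:", stale_flag)
print("fresh check:", fresh_flag)
```

Assuming the library only consults the cached flag, the DataFrame branch is skipped until the process is restarted and the check runs again.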


User 19 | 7/28/2015, 6:26:41 PM

Hi JohnnyM,

Thank you for reporting this! I agree that the error message does little to help point to the actual problem. I wasn't able to reproduce this with 1.5.2; can you confirm whether or not you still get this?

Thanks, Chris


User 2032 | 8/3/2015, 1:56:25 PM

I did not try to reproduce it once it was fixed, but it did happen on 1.5.2. To be honest, I have been running into quite a lot of issues with GLC recently, so I hardly have time to reproduce them; most of my time goes to fixing them or identifying the cause. I even managed to crash the server with an OOM in flatmap, to the point of needing a hard reset. Expect a bug report later, once I have the time.