Open source sframe + scikit

User 3033 | 1/15/2016, 6:11:28 PM

I am trying to use open source SFrame (instead of pandas) with scikit-learn. I am unable to find an efficient way to pass input features to scikit-learn estimators. Here is an example. Consider a pandas dataframe, df, that holds features of house price data. To use linear regression, all that is needed is the following

input_features = df.values

Then one can use input_features as follows: X_train, X_test, Y_train, Y_test = train_test_split(input_features, target, ...)

If I use SFrame instead, I can't find an efficient way to convert an SFrame into an ndarray for scikit-learn estimators, which require input features of shape n_samples x m_features. Has anyone done this?

Comments

User 16 | 1/16/2016, 8:14:32 PM

If you use the most recent version of SFrame (which only became available via pip yesterday) you can use the [to_numpy function](https://dato.com/products/create/docs/generated/graphlab.SFrame.tonumpy.html#graphlab.SFrame.to_numpy) to create an ndarray from an SFrame.
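For anyone on an older release without to_numpy, here is a rough column-by-column fallback. It is only a sketch: the helper name columns_to_matrix and the sample data are made up for illustration, it assumes every feature column is numeric, and it copies all the data into memory, so it is unlikely to be fast on large frames. A plain dict of lists stands in for the SFrame so the snippet runs with NumPy alone; it relies only on frame[name] returning something iterable.

```python
import numpy as np


def columns_to_matrix(frame, feature_names):
    """Stack the named columns into an (n_samples, n_features) float array.

    `frame` only needs to support frame[name] -> iterable of numbers.
    This helper is hypothetical and not part of SFrame or scikit-learn.
    """
    columns = [np.asarray(list(frame[name]), dtype=float) for name in feature_names]
    return np.column_stack(columns)


# Made-up house price data; a dict of lists stands in for the SFrame here.
houses = {
    "sqft": [1180, 2570, 770, 1960],
    "bedrooms": [3, 3, 2, 4],
    "price": [221900.0, 538000.0, 180000.0, 604000.0],
}

input_features = columns_to_matrix(houses, ["sqft", "bedrooms"])
target = np.asarray(houses["price"], dtype=float)

print(input_features.shape)  # (4, 2)
```

The resulting input_features and target can then be passed to train_test_split the same way as the pandas df.values array in the question.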

Let me know if you have any more questions. I'd also be interested to learn what you're doing with SFrames and scikit-learn.