How to create a sparse matrix based on a categorical variable?

User 2788 | 12/16/2015, 6:57:34 AM

I want to reproduce the behavior of the pandas get_dummies function for data frames, using GraphLab Create and SFrame.

Here is my original data set. I want to create a sparse matrix based on it, and the result should look like this:

I tried the code below:

```python
Fruit = temp['Fruit'].unique()
for i in range(len(Fruit)):
    temp[Fruit[i]] = 0

def assignment(x):
    x[x['Fruit']] = 1

temp.apply(assignment)
```

But the corresponding cell is still 0.
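One likely explanation, assuming a row-wise apply hands each row to the function as its own dict and builds a new column from the return values: the function above only mutates that per-row dict and returns nothing, so the original table is never touched. The sketch below imitates this in plain Python. It makes no GraphLab calls, and `apply_rows` and `indicator` are hypothetical names used only for illustration.

```python
# Pure-Python stand-in for a row-wise apply; this is not the SFrame API.
rows = [
    {"Fruit": "apple", "apple": 0, "banana": 0},
    {"Fruit": "banana", "apple": 0, "banana": 0},
]

def apply_rows(table, fn):
    """Hypothetical helper: call fn on a copy of each row, collect the return values."""
    return [fn(dict(row)) for row in table]

def assignment(x):
    # Mutates only the copy passed in and implicitly returns None.
    x[x["Fruit"]] = 1

def indicator(x):
    # Returns the indicator instead of mutating the row.
    return {x["Fruit"]: 1}

mutated = apply_rows(rows, assignment)   # list of None; rows are unchanged
returned = apply_rows(rows, indicator)   # list of one-entry dicts
```

Under that assumption, any function passed to apply has to return the value it computes, and the result has to be assigned to a new column.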

Comments

User 1190 | 12/16/2015, 6:17:02 PM

Please use the one-hot encoder from the feature engineering module:

https://dato.com/learn/userguide/feature-engineering/onehotencoder.html

If you want a truly sparse matrix, you could use SFrame.unpack() to unpack the dictionary column. However, this is not recommended, because SFrame is not designed to handle large numbers of physical columns. Sparse data or dense array data should be packed into a single column of dict or array type.
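To illustrate the trade-off between a packed dict column and unpacked physical columns, here is a minimal plain-Python sketch. It does not call SFrame.unpack(); `unpack_dicts` is a hypothetical helper, and filling missing entries with 0 is an assumption of this sketch rather than documented SFrame behavior.

```python
def unpack_dicts(dict_rows, fill=0):
    """Hypothetical helper: expand sparse per-row dicts into one dense column per key."""
    keys = sorted({k for row in dict_rows for k in row})
    return {k: [row.get(k, fill) for row in dict_rows] for k in keys}

# Packed form: one dict per row, holding only the non-zero entry.
packed = [{"apple": 1}, {"banana": 1}, {"apple": 1}, {"cherry": 1}]

# Unpacked form: one physical column per category.
dense = unpack_dicts(packed)

stored_packed = sum(len(row) for row in packed)          # one entry per row
stored_dense = sum(len(col) for col in dense.values())   # rows times categories
```

The packed form stores one entry per row, while the unpacked form stores rows times categories, which is why the number of physical columns grows with every new category.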

Thanks -jay


User 2788 | 12/16/2015, 6:32:27 PM

Thank you for your answer, but the column is not a dictionary column.


User 1190 | 12/16/2015, 6:44:39 PM

The dict column is the result of applying the one-hot encoder.
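As a rough picture of how a dict column can arise from a categorical column, here is a plain-Python sketch of one-hot encoding. It is not the GraphLab encoder: `one_hot_dicts` is a hypothetical function, and keying each dict by an integer category index is an assumption of this sketch, so check the linked user guide for the actual output format.

```python
def one_hot_dicts(values):
    """Hypothetical helper: map each category to an index, then encode each row as {index: 1}."""
    categories = sorted(set(values))
    index = {category: i for i, category in enumerate(categories)}
    encoded = [{index[v]: 1} for v in values]
    return index, encoded

fruit = ["apple", "banana", "apple", "cherry"]
index, encoded = one_hot_dicts(fruit)
# index maps each category to its position; encoded holds one single-entry dict per row.
```

Each row ends up with a single-entry dict, so the encoded data stays in one column no matter how many categories there are.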