SFrame.unpack() doesn't act as expected: turns my unicode strings into a list of characters

User 2879 | 1/9/2016, 11:03:37 PM

Hey everyone!

I'm creating an SFrame out of a list of dicts containing unicode strings, like [{"title": u"battle of the bands", "venue": "The Roxy"},{"title": "comedy night", "venue": u"The Roxy"}]. All the dicts end up in one column, and I then use SFrame.unpack() to turn each key in the dict into its own column.

However, when I do that, I don't get the unicode strings as they were in the dict, but something like ["T", "h", "e", " ", "R", "o", ... ]. Is this how unpack is supposed to work? Unpack works fine if I convert the strings to byte strings before unpacking, but I'd like to know why unpack is doing this.
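For reference, the byte-string workaround mentioned above might look roughly like the sketch below. It is only an illustration: the helper name `to_byte_strings` and the choice of UTF-8 are assumptions, not anything from GraphLab Create, and the code only prepares the list of dicts without calling the library.

```python
import sys

# The text type is `unicode` on Python 2 and `str` on Python 3.
text_type = str if sys.version_info[0] >= 3 else unicode  # noqa: F821


def to_byte_strings(records):
    """Return a new list of dicts with every text value encoded as UTF-8 bytes.

    Hypothetical helper for illustration; non-text values are left untouched.
    """
    return [
        {
            key: (value.encode("utf-8") if isinstance(value, text_type) else value)
            for key, value in record.items()
        }
        for record in records
    ]


records = [
    {"title": u"battle of the bands", "venue": "The Roxy"},
    {"title": "comedy night", "venue": u"The Roxy"},
]

# `cleaned` holds byte strings only; the original `records` list is unchanged.
cleaned = to_byte_strings(records)
```

On Python 2, `cleaned` could then be used in place of the original list when building the SFrame and unpacking it.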

Comments

User 1190 | 1/13/2016, 7:28:05 PM

Hi,

The following code seems to work as expected (tried with the latest GLC, 1.7, and the previous version, 1.6):

```
import graphlab as gl

# Unpack an SArray of dicts: one column per key, prefixed with "X."
sa = gl.SArray([{"title": u"battle of the bands", "venue": "The Roxy"},{"title": "comedy night", "venue": u"The Roxy"}])
sa.unpack()
+---------------------+----------+
| X.title             | X.venue  |
+---------------------+----------+
| battle of the bands | The Roxy |
| comedy night        | The Roxy |
+---------------------+----------+
```

```
# Same data as an SFrame: the dicts land in column 'X1', which is then unpacked
sf = gl.SFrame([{"title": u"battle of the bands", "venue": "The Roxy"},{"title": "comedy night", "venue": u"The Roxy"}])
sf.unpack('X1')
+---------------------+----------+
| X1.title            | X1.venue |
+---------------------+----------+
| battle of the bands | The Roxy |
| comedy night        | The Roxy |
+---------------------+----------+
```

-jay