Numpy-style array indexing of SArray?

User 1933 | 6/24/2015, 6:45:13 PM

In numpy, I can do array indexing. Here's a trivial example:

In [9]: x = np.random.randint(0,10,10)    
In [10]: x
Out[10]: array([7, 3, 0, 6, 1, 6, 3, 0, 5, 3])
In [11]: selection = [2,7,8]
In [12]: x[selection]
Out[12]: array([0, 0, 5])

Is there a way to achieve similar behavior with an SArray? Specifically, I have the output of an SFrame apply (an Nx1 SArray of floats) from which I need to select specific rows using an array of indices. The only solution I can think of seems really clunky: converting to an SFrame, adding an index column, doing a join, and then selecting the column I need.

Let's assume I have an SArray containing the same values as "x" above:

In [34]: arr = gl.SArray(x)
In [35]: arr
Out[35]:
dtype: int
Rows: 10
[7, 3, 0, 6, 1, 6, 3, 0, 5, 3]

Then I'd have to do something like this:

In [40]: gl.SFrame({'vals':arr,'idx':range(len(arr))}).join(gl.SFrame({'idx':selection}))['vals']
Out[40]:
dtype: int
Rows: 3
[0, 0, 5]

That gives the same output as above. It works, but it seems like a whole lot of extra overhead that I would like to avoid if possible. Another option is converting the SArray I need to index into a numpy array, but that has proven prohibitively slow. Any ideas?

Comments

User 1933 | 6/24/2015, 7:24:36 PM

Alternatively, if there's a way to get an SFrame.apply to return a regular numpy array instead of an SArray, that would let me sidestep all of this...


User 2002 | 6/26/2015, 12:00:09 AM

Hi Jlorice, unfortunately SFrames don't support multiple indexing, but we do support applying a logical filter, which would accomplish the same thing:

In [40]: x = gl.SArray([7, 3, 0, 6, 1, 6, 3, 0, 5, 3])
In [41]: selection = gl.SArray([0, 0, 1, 0, 0, 0, 0, 1, 1, 0])
In [42]: x[selection]
Out[42]:
dtype: int
Rows: ?
[0, 0, 5, ... ]

Additional methods of filtering are available in our documentation on SArray():
https://dato.com/products/create/docs/generated/graphlab.SArray.html
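
To get from the index list in the question ([2, 7, 8]) to a 0/1 mask like the one above, something along these lines might work. This is only a sketch in plain Python: indices_to_mask is a hypothetical helper, not part of GraphLab Create, and the final step of wrapping the mask in gl.SArray is an untested assumption based on the filter example above.

```python
def indices_to_mask(indices, length):
    """Return a list of 0/1 flags with a 1 at every position named in `indices`."""
    mask = [0] * length
    for i in indices:
        if not 0 <= i < length:
            raise IndexError("index %d out of range for length %d" % (i, length))
        mask[i] = 1
    return mask


values = [7, 3, 0, 6, 1, 6, 3, 0, 5, 3]
selection = [2, 7, 8]
mask = indices_to_mask(selection, len(values))

# Plain-Python stand-in for the logical filter, used here only to check
# that the mask picks out the intended rows.
picked = [v for v, keep in zip(values, mask) if keep]
print(mask)
print(picked)

# With GraphLab Create the filter step would presumably be (assumption, untested):
#   arr[gl.SArray(mask)]
```

Note that a mask returns the selected rows once each, in their original order, so unlike numpy array indexing it cannot repeat or reorder rows.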

Hope this helps.

Regards, Punit