Can't use `numpy.int64` data type as index + wrong error message

User 1129 | 2/9/2015, 11:57:59 AM

I think that the following code should work without throwing exceptions

<pre><code>import graphlab as gl
import numpy as np

sf = gl.SFrame({'type': ['cat', 'fossa', 'bat'], 'height': [15., 23.5, 19]})

# line with the maximal height
ix = np.argmax(sf['height'])
sf[ix]</code></pre>

In addition, the error message says "Invalid index type: must be SArray, list, or str", which is misleading, because integer indices are clearly supported as well.

Comments

User 1129 | 2/9/2015, 12:01:45 PM

BTW: you test for types using <code>if type(key) is sometype</code>. IMHO, a better way would be <code>if isinstance(key, sometype)</code>, which would also accept objects of derived types (such as numpy.int64).
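To make the difference concrete, here is a minimal sketch. `Int64Like` is a hypothetical stand-in for a type derived from `int`, and the two `accepts_*` helpers are invented for illustration; none of this is GraphLab's actual code.

```python
class Int64Like(int):
    """Hypothetical stand-in for an integer type that derives from int."""


def accepts_exact(key):
    # Exact type comparison: rejects every subclass of int.
    return type(key) is int


def accepts_isinstance(key):
    # isinstance also accepts instances of subclasses of int.
    return isinstance(key, int)


ix = Int64Like(2)
print(accepts_exact(ix))       # False
print(accepts_isinstance(ix))  # True
print(accepts_exact(2))        # True
```

Whether `numpy.int64` really derives from `int` depends, as far as I know, on the Python version and platform, so a check against `numbers.Integral` may be the safer variant of the same idea.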

An even more drastic way of testing for variable types is not testing them at all, as suggested here: http://stackoverflow.com/a/154156/17523. I think that this is the best approach. But you know... you are the bosses.
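A rough sketch of that "don't test at all" approach, assuming an integer index is anything that implements `__index__`: the standard library's `operator.index` does the conversion and raises `TypeError` for everything else. `get_row` and `FakeInt64` are hypothetical names, and a plain list stands in for the SFrame.

```python
import operator


class FakeInt64(object):
    """Hypothetical integer-like type that does not derive from int."""

    def __init__(self, value):
        self._value = value

    def __index__(self):
        return self._value


def get_row(rows, key):
    # No type test: ask the key to behave like an integer index.
    try:
        i = operator.index(key)
    except TypeError:
        raise TypeError("Invalid index type: %s" % type(key).__name__)
    return rows[i]


rows = ['cat', 'fossa', 'bat']
print(get_row(rows, 1))             # fossa
print(get_row(rows, FakeInt64(1)))  # fossa
```

A real implementation would still have to dispatch on SArray, list, and str keys; this only covers the integer case.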


User 954 | 2/9/2015, 6:49:11 PM

Hi bgbg,

You are right: isinstance() should be used in this case. We will fix this type checking in the new release. Btw, you can use GraphLab's ARGMAX aggregator as a workaround:

<pre><code>sf.groupby(key_columns=[], operations={'maximum_height': gl.aggregate.ARGMAX('height', 'type')})</code></pre>


User 1129 | 2/12/2015, 6:19:24 AM

Nice to know. Although it is so much more verbose.


User 954 | 2/15/2015, 11:25:33 PM

I agree. We will make the argmax/argmin API similar to numpy's in the next release.