List columns in composite distances

User 2077 | 8/18/2015, 4:15:40 AM

It seems strange to me that you can't use list columns automatically in a composite distance function. What's the recommended way of including them, given that it can't be done directly? Just convert them to a dictionary?

```python
sf = gl.SFrame({"id": range(5), "categories": [["taxes", "financial"], ["financial", "home"], ["health", "entertainment"], ["entertainment", "music", "movies"], ["health", "medical"]]})

distance = [(("categories",), "jaccard", 1),]

gl.nearest_neighbors.create(sf, distance)

TypeError                                 Traceback (most recent call last)
<ipython-input-294-78e8dae9c5de> in <module>()
      3 distance = [(("categories",), "jaccard", 1),]
      4
----> 5 gl.nearest_neighbors.create(sf, distance)

/Users/rlvoyer/Envs/bgg/lib/python2.7/site-packages/graphlab/toolkits/nearestneighbors/nearestneighbors.pyc in create(dataset, label, features, distance, method, verbose, **kwargs)
    337
    338     ## Initial validation and processing of the label
--> 339     dataset, label = tkutl.validaterowlabel(dataset, label=label)
    340     reflabels = _dataset[_label]
    341

/Users/rlvoyer/Envs/bgg/lib/python2.7/site-packages/graphlab/toolkits/internalutils.pyc in validaterowlabel(dataset, label, defaultlabel)
    513     ## Validate the label name and types.
    514     if not isinstance(label, str):
--> 515         raise TypeError("The row label column name '{}' must be a string.".format(label))
    516
    517     if not label in dataset.column_names():

TypeError: The row label column name '[(('categories',), 'jaccard', 1)]' must be a string.
```

Comments

User 15 | 8/18/2015, 5:17:57 PM

Hey Robert,

The error you posted comes from passing the list you named "distance" as the "label" parameter: it is the second positional argument to create, so it needs to be passed by keyword (distance=distance) instead. When I correct for that, though, I get the error "Feature 'categories' not of type integer, float, dictionary, vector, or string." Is this the issue you're referring to? Indeed, when I convert the lists to dictionaries, the model is created successfully. Brian is out until later today... I'll ask him why list isn't accepted when he gets in.

Evan
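
For anyone landing here later, a minimal sketch of the workaround described above: turn each list into a dictionary whose keys are the list elements, so that Jaccard distance can be computed over the key sets. The helper names `list_to_dict` and `jaccard_distance` are made up for illustration and are not part of GraphLab Create. The pure-Python distance only shows what the conversion preserves, and the GraphLab calls in the trailing comments are an untested assumption about how it would be wired in.

```python
def list_to_dict(items):
    """Map every element of a list to 1, so the dict keys mirror the list contents."""
    return {item: 1 for item in items}


def jaccard_distance(a, b):
    """Jaccard distance between the key sets of two dictionaries."""
    keys_a, keys_b = set(a), set(b)
    union = keys_a | keys_b
    if not union:
        return 0.0  # two empty sets are treated as identical
    return 1.0 - len(keys_a & keys_b) / float(len(union))


categories = [
    ["taxes", "financial"],
    ["financial", "home"],
    ["health", "entertainment"],
    ["entertainment", "music", "movies"],
    ["health", "medical"],
]

# Same data as in the question, with each list converted to a dictionary.
as_dicts = [list_to_dict(c) for c in categories]

# Rows 0 and 1 share one category ("financial") out of three distinct ones.
print(jaccard_distance(as_dicts[0], as_dicts[1]))

# Assumed GraphLab Create usage (not run here): convert the column, then pass
# the composite distance by keyword so it is not mistaken for the label.
#   sf["categories"] = sf["categories"].apply(list_to_dict)
#   distance = [(("categories",), "jaccard", 1)]
#   model = gl.nearest_neighbors.create(sf, distance=distance)
```

The value stored against each key is arbitrary here, on the assumption that only the keys matter for Jaccard distance on dictionary columns.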