graphlab.logistic_classifier parameter Target

User 2465 | 10/23/2015, 7:08:33 PM

Hi,

I have a set of words assigned to a variable called selected_words. Here is the code:

selected_words = ['awesome', 'great', 'fantastic', 'amazing', 'love', 'horrible', 'bad', 'terrible', 'awful', 'wow', 'hate']

I want to train a logistic classifier on a field called 'wordcount', using the words contained in the selected_words variable. I am using the following code and I am getting an error:

selected_words_model = graphlab.logistic_classifier.create(train_data, target=selected_words, features=selected_words, validation_set=test_data)

My problem is that all the examples I have seen for logistic_classifier use only one column as the target. As you can see, I have a set of columns as the target.

Any ideas?

Thank you very much.

Comments

User 1592 | 10/24/2015, 12:12:46 PM

Hi, it seems there is some confusion about how a logistic classifier works. The target should be a single column holding a binary value, either 0 or 1, which gives the class of the row (negative or positive). The features can be several columns that the model learns from. In any case, the target cannot be the same as the features. The goal of the logistic classifier is to find a set of weights for the feature columns that, together with a logistic function, accurately predicts the target.
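To make the distinction concrete, here is a minimal sketch in plain Python (no GraphLab needed). It assumes a made-up binary column named 'sentiment' as the single target, a shortened word list, and four invented rows; none of these come from the original data. It expands a word-count dictionary into one numeric feature per selected word and fits weights with simple gradient descent, which is only an illustration of the idea and not necessarily how logistic_classifier is implemented.

```python
import math

# Shortened from the list in the question, to keep the example small.
selected_words = ['awesome', 'great', 'hate']

# Hypothetical rows: a word-count dict plus ONE binary target column.
# The column name 'sentiment' and all values are invented for illustration.
rows = [
    {'word_count': {'awesome': 2, 'great': 1}, 'sentiment': 1},
    {'word_count': {'hate': 3}, 'sentiment': 0},
    {'word_count': {'great': 2}, 'sentiment': 1},
    {'word_count': {'hate': 1, 'the': 4}, 'sentiment': 0},
]

def to_features(row):
    """One numeric feature per selected word; absent words count as 0."""
    return [row['word_count'].get(word, 0) for word in selected_words]

def sigmoid(z):
    """The logistic function, mapping any real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

def predict_proba(weights, bias, x):
    """Probability of the positive class for one feature vector."""
    return sigmoid(bias + sum(w * xi for w, xi in zip(weights, x)))

def fit(X, y, lr=0.1, epochs=200):
    """Stochastic gradient descent on the logistic loss."""
    weights = [0.0] * len(X[0])
    bias = 0.0
    for _ in range(epochs):
        for x, target in zip(X, y):
            error = predict_proba(weights, bias, x) - target
            weights = [w - lr * error * xi for w, xi in zip(weights, x)]
            bias -= lr * error
    return weights, bias

X = [to_features(row) for row in rows]   # several feature columns
y = [row['sentiment'] for row in rows]   # exactly one target column
weights, bias = fit(X, y)

for word, weight in zip(selected_words, weights):
    print(word, round(weight, 3))
```

In the call from the question, the equivalent change would be to keep features=selected_words but pass the name of one binary column as target, assuming train_data has such a column and one count column per selected word.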


User 2429 | 10/26/2015, 5:19:21 AM


User 1592 | 10/26/2015, 5:42:31 AM

Not sure I understand - please give an example of what the target is. Thanks


User 3471 | 3/11/2016, 11:16:40 AM

Dear D Bickson, I have a logistic regression classifier with several feature columns. One of the columns contains only zeros, and training returns an error: it does not recognize the column and says the column is missing. Why is a feature column not recognized when it contains only zeros? Kind regards, Macilane


User 15 | 3/11/2016, 10:59:49 PM

Hi @macilane,

I tried to reproduce what you describe, but I could not. A column of all zeros was still used by the logistic_classifier. Are you sure you got the column name right?

Evan
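
One way to check the column-name question is to compare the requested feature names against the table's actual column names before training. The sketch below is plain Python; the helper name missing_features and the sample column names are invented for illustration. With GraphLab Create, the list of names would come from the SFrame's column_names() method.

```python
def missing_features(column_names, features):
    """Return the requested feature names that are not actual columns."""
    available = set(column_names)
    return [name for name in features if name not in available]

# Hypothetical column names; note the trailing space and the different case.
column_names = ['awesome', 'great ', 'Hate', 'sentiment']
features = ['awesome', 'great', 'hate']

print(missing_features(column_names, features))
```

An empty list means every requested feature exists as a column, so the cause of a "missing column" error would have to be something other than the name.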