Playing with 'class_weights' in Boosted Trees Classifier

User 1905 | 7/17/2015, 4:37:24 PM


I'm trying to adjust the way the model weighs each class in my dataset and I see that there is a parameter called 'class_weights' which allows me to set each weight individually.

The problem is that it's not clear what the 'unit' or 'metric' of this "weighing" is. The documentation says it takes a dictionary of values and that it "Weights the examples in the training data according to the given class weights," but it doesn't say what values I should provide for the class weights.

For example, should I be passing a percentage by which to adjust each class's weight? (e.g. 20%, -15%)

Please let me know if you can help.



P.S. I believe this question also applies to the SVM and Logistic Classifier models.


User 18 | 7/18/2015, 1:13:11 AM

Hi @saarkagan,

The weights can be any positive numbers greater than 1e-20. What matters is how you want to weigh the examples relative to each other. For example, you can set them to be in (0, 1]; examples from a class with a weight below 1.0 are discounted relative to examples from a class with weight 1.0.
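
To make "relative to each other" concrete, here is a rough sketch in plain Python. It is not the library's internals; it only assumes that every training example carries the weight of its class, as the documentation quote suggests. The function name effective_class_share and the toy labels are made up for illustration.

```python
from collections import Counter

def effective_class_share(labels, class_weights):
    """Share of the total weighted training mass that each class ends up with.

    Assumption: each example is weighted by the weight of its class.
    """
    counts = Counter(labels)
    # Weighted mass per class = number of examples * class weight
    weighted = {c: counts[c] * class_weights[c] for c in counts}
    total = sum(weighted.values())
    return {c: weighted[c] / total for c in weighted}

# Toy data: two equally sized classes, one discounted to half weight
labels = ['Books'] * 50 + ['Toys'] * 50
shares = effective_class_share(labels, {'Books': 0.5, 'Toys': 1.0})
print(shares)
```

With equal class sizes, halving the weight of 'Books' drops its share of the weighted training mass from one half to one third, so the values act as relative multipliers rather than percentages.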


User 1905 | 7/18/2015, 9:38:04 AM

Ok, so let's say these are my classes: Books, Beauty, Toys, Cell Phones, and Apparel.

And I want to discount the first 3 as follows: -30%, -20%, -10%

Would I build my dictionary as follows?

{'Books': 0.7, 'Beauty': 0.8, 'Toys': 0.9, 'Cell Phones': 1, 'Apparel': 1}

Alternatively, can I just leave out the ones that are unchanged?

{'Books': 0.7, 'Beauty': 0.8, 'Toys': 0.9}

User 18 | 7/19/2015, 11:09:14 PM

The first example is correct. You'll need to specify the weight for all classes in the custom class weights dictionary.
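
If you'd rather only type the classes you want to change, one option is to fill in the rest yourself before passing the dictionary. This is just a sketch in plain Python; complete_class_weights is a hypothetical helper, not part of the library, and the default of 1.0 for unlisted classes is my assumption of what "unchanged" means.

```python
def complete_class_weights(classes, overrides, default=1.0):
    """Return a weight for every class, using `default` where none is given."""
    unknown = set(overrides) - set(classes)
    if unknown:
        raise ValueError("Weights given for unknown classes: %s" % sorted(unknown))
    weights = {c: float(overrides.get(c, default)) for c in classes}
    for c, w in weights.items():
        # Weights must be positive and greater than 1e-20
        if w <= 1e-20:
            raise ValueError("Weight for class %r must be greater than 1e-20" % c)
    return weights

classes = ['Books', 'Beauty', 'Toys', 'Cell Phones', 'Apparel']
class_weights = complete_class_weights(
    classes, {'Books': 0.7, 'Beauty': 0.8, 'Toys': 0.9})
print(class_weights)
```

The result has an entry for all five classes and can then be passed as the class_weights value.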

User 1905 | 7/24/2015, 2:25:32 PM

Thank you