User 1843 | 4/29/2015, 6:00:34 PM

Hi,

Can someone share some insight on how GraphLab calculates the predictions for the Gradient Boosted Trees Classifier in terms of probability? In one of the Gradient Boosted Trees Classifier models I created, I used 8 trees, and I tried one sample for prediction. Based on the JSON trees, the leaf-node values that the sample reached are:
-0.536623
-0.346335
-0.003302
-0.206162
0.114814
0.096973
-0.514708
-0.152272

And when I used the model to predict this single sample with the probability output type, the result was **0.17543102667164928**. Can someone kindly explain how that result is calculated from the scores listed above? Simply summing or averaging those scores didn't give me that result.

BTW, I used class weights {0: 1, 1: 12}, step_size 0.3, max_depth 6, and min_child_weight 0.1 in training. I'm not sure whether those are used in the predictions.

Thanks

User 1190 | 5/4/2015, 6:07:49 PM

Hi @Bruce_Yang,

The leaf nodes store weights, which are transformed into probabilities via the logistic function. If you construct only one tree and observe weight "w" at a particular leaf node, the probability should be 1 / (1 + exp(-w)).

With boosted trees, the weights are summed and then transformed into the probability: 1 / (1 + exp(-(w0 + w1 + w2 + ...))).

In your case, let W = -0.536623 + (-0.346335) + ...; you should get back prob = 1 / (1 + exp(-W)) = 0.175.
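The calculation above can be checked directly; a minimal sketch using the eight leaf values from the original post:

```python
import math

# Leaf values the single sample reached across the 8 trees
# (copied from the post above).
leaf_values = [
    -0.536623, -0.346335, -0.003302, -0.206162,
     0.114814,  0.096973, -0.514708, -0.152272,
]

# Boosted trees sum the leaf weights into a raw score,
# then map it to a probability with the logistic function.
raw_score = sum(leaf_values)
prob = 1.0 / (1.0 + math.exp(-raw_score))

print(raw_score)  # ≈ -1.547615
print(prob)       # ≈ 0.175431, matching the model's reported probability
```

Note that neither averaging the leaf values nor summing them alone reproduces the output; the logistic transform of the sum is what yields 0.1754.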

User 3066 | 1/19/2016, 2:39:32 AM

It's clear to me how the probability is calculated. However, which class does this probability correspond to? And how does the model decide the class based on this probability?