Binary Classification (probabilities/class labels)

User 984 | 11/25/2014, 6:39:29 PM

I have a binary classification problem with two classes, 0 and 1. If I am interested in just the probability of success (i.e. of seeing a 1), is that what is returned when I call <code class="CodeInline">predict</code>? The functionality seems to be well defined for the <code class="CodeInline">LogisticClassifier</code> but less so for the <code class="CodeInline">BoostedTreesClassifier</code>.

I would also like to predict the probability of a 1 using an SVM and a neural network, but the functionality doesn't seem to be there. Is there a hacky way to get the outputs of a neural network before they are passed through the softmax layer?

Comments

User 91 | 11/25/2014, 7:00:58 PM

An SVM has no natural probabilistic interpretation. Its raw output is a margin (a signed score relative to the decision boundary), so it has to be calibrated before it can be read as a probability. The most popular calibration method is described here: http://citeseer.ist.psu.edu/viewdoc/summary?doi=10.1.1.41.1639
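
A minimal sketch of this kind of calibration, assuming you already have raw SVM margins and 0/1 labels for a held-out set (how you obtain the margins depends on your library). It fits a sigmoid `p = sigmoid(a * margin + b)` by plain gradient descent on the log loss. This is a simplification of the approach often called Platt scaling, not a faithful reimplementation: there is no target smoothing and no second-order optimizer, and the names `fit_sigmoid_calibration` and `calibrated_probability` are made up for illustration.

```python
import math

def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def fit_sigmoid_calibration(margins, labels, lr=0.1, n_iter=5000):
    """Fit p(y=1 | margin) = sigmoid(a * margin + b) by gradient descent on log loss.

    margins: raw SVM outputs on held-out data (assumed to be given).
    labels:  matching 0/1 class labels.
    """
    a, b = 1.0, 0.0
    n = len(margins)
    for _ in range(n_iter):
        grad_a = 0.0
        grad_b = 0.0
        for m, y in zip(margins, labels):
            err = sigmoid(a * m + b) - y  # derivative of log loss w.r.t. the logit
            grad_a += err * m
            grad_b += err
        a -= lr * grad_a / n
        b -= lr * grad_b / n
    return a, b

def calibrated_probability(margin, a, b):
    # Map a new margin to an estimated probability of class 1.
    return sigmoid(a * margin + b)
```

Fit `a` and `b` on data the SVM was not trained on, then apply `calibrated_probability` to the margins of new examples. If the two classes are perfectly separable by margin, the unregularized fit keeps increasing `a`, so a cap on iterations (as here) or some regularization is needed.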

Calibration is something we will incorporate out of the box in a future release, but for now you could implement it yourself fairly easily. Another option is to simply pass the margin through a logistic function, which maps it into the range between 0 and 1. That isn't the best way to do it, since the result is not a calibrated probability, but it is a hacky way out.
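
The quick workaround can be sketched as follows, assuming the raw margins are already available as plain floats. The function name `margin_to_score` and the example margins are hypothetical. The output preserves the ranking of the margins and lies strictly between 0 and 1, but it should be treated as a score rather than a calibrated probability.

```python
import math

def margin_to_score(margin):
    """Squash a raw SVM margin into (0, 1) with the logistic function."""
    if margin >= 0:
        return 1.0 / (1.0 + math.exp(-margin))
    e = math.exp(margin)  # avoids overflow for large negative margins
    return e / (1.0 + e)

# Hypothetical margins, for illustration only.
margins = [-3.0, -0.5, 0.0, 0.5, 3.0]
scores = [margin_to_score(m) for m in margins]
```

A margin of 0 (a point on the decision boundary) maps to 0.5, and margins of equal size but opposite sign map to scores that sum to 1.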