Incorrect area under the curve (AUC) value for the ROC curve

User 2404 | 10/13/2015, 4:05:03 AM

I was working with a sentiment classifier and noticed that the evaluate() function reports an incorrect AUC value. To reproduce the bug:

1. Download the <a href="https://drive.google.com/file/d/0B0c0MbnP6Nn-WHZvYU10VGNrR00" target="blank">example dataset</a> and an <a href="https://drive.google.com/a/cs.washington.edu/file/d/0B0c0MbnP6Nn-MzZsc2pYejVkOTg" target="blank">iPython notebook</a>.
2. Extract data.gl/ from data.gl.zip.
3. Place ROC bug reproduction.ipynb in the same directory as data.gl/.
4. Open ROC bug reproduction.ipynb in an iPython notebook session.
5. Run all the commands.

The attached iPython notebook (ROC bug reproduction.ipynb) first runs an evaluation on the given dataset using the roc_curve metric. It then manually computes the AUC from the resulting curve using the trapezoidal rule, so the two values can be compared.
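
For readers without the notebook, the manual check can be sketched roughly as below. This is a minimal illustration, not the notebook's actual code: the function name `trapezoid_auc` and the sample points are hypothetical, and it assumes the ROC curve is available as two equal-length lists of false positive rates and true positive rates.

```python
def trapezoid_auc(fpr, tpr):
    """Approximate the area under an ROC curve with the trapezoidal rule.

    fpr, tpr: equal-length sequences of false/true positive rates,
    one pair per classification threshold (assumed input format).
    """
    if len(fpr) != len(tpr):
        raise ValueError("fpr and tpr must have the same length")

    # Sort the points by false positive rate so that the curve is
    # traversed left to right (ties are ordered by true positive rate).
    points = sorted(zip(fpr, tpr))

    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        # Area of one trapezoid: width times the average of the two heights.
        area += (x1 - x0) * (y0 + y1) / 2.0
    return area


# Hypothetical ROC points, for illustration only.
fpr = [0.0, 0.0, 0.5, 1.0]
tpr = [0.0, 0.5, 1.0, 1.0]
auc = trapezoid_auc(fpr, tpr)
print(auc)  # 0.875
```

If the value computed this way differs noticeably from the AUC reported by evaluate(), that points to the discrepancy described above.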

Comments

User 940 | 10/13/2015, 5:23:14 PM

Hi @chohyu01,

Thanks for the bug report and the full repro. We will look into this and keep you posted.

Cheers! -Piotr