Terminated due to numerical difficulties

User 2594 | 11/21/2015, 3:33:37 AM

I'm getting an error message, and I'm having trouble finding a direct answer to what it means, or what to do about it. The error message is:

PROGRESS: Logistic regression:
PROGRESS: --------------------------------------------------------
PROGRESS: Number of examples          : 133448
PROGRESS: Number of classes           : 2
PROGRESS: Number of feature columns   : 1
PROGRESS: Number of unpacked features : 65744
PROGRESS: Number of coefficients      : 65745
PROGRESS: Starting L-BFGS
PROGRESS: --------------------------------------------------------
PROGRESS: +-----------+----------+-----------+--------------+-------------------+---------------------+
PROGRESS: | Iteration | Passes   | Step size | Elapsed Time | Training-accuracy | Validation-accuracy |
PROGRESS: +-----------+----------+-----------+--------------+-------------------+---------------------+
PROGRESS: | 1         | 7        | 0.000000  | 4.301178     | 0.841399          | 0.840049            |
PROGRESS: | 2         | 10       | 5.000000  | 6.155005    | 0.887439          | 0.883107            |
PROGRESS: | 3         | 11       | 5.000000  | 7.049176    | 0.923363          | 0.916196            |
PROGRESS: +-----------+----------+-----------+--------------+-------------------+---------------------+
PROGRESS: TERMINATED: Terminated due to numerical difficulties.
PROGRESS: This model may not be ideal. To improve it, consider doing one of the following:
PROGRESS: (a) Increasing the regularization.
PROGRESS: (b) Standardizing the input data.
PROGRESS: (c) Removing highly correlated features.
PROGRESS: (d) Removing `inf` and `NaN` values in the training data.

I'm relatively certain that there are no inf or NaN values in the training data. I'm sure there are highly correlated features, but I don't yet know how to remove them. The input data is pretty standardized, and training works if I prune out some of the values. And I have no idea what "increasing the regularization" means.
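Here's how I checked for inf/NaN values, in case I'm doing it wrong. (A minimal sketch: it assumes the features are a dictionary-typed column, like the one that gets "unpacked" in the progress log, iterated as plain Python dicts; the function and key names are just mine.)

```python
import math

def find_nonfinite(rows, feature_key="features"):
    """Return (row_index, feature_name) pairs whose value is NaN or inf.

    Assumes each row is a dict whose `feature_key` entry maps feature
    names to floats.
    """
    bad = []
    for i, row in enumerate(rows):
        for name, value in row[feature_key].items():
            if not math.isfinite(value):  # False for NaN, inf, and -inf
                bad.append((i, name))
    return bad

# Toy rows: the second one has both a NaN and an inf.
rows = [
    {"features": {"a": 1.0, "b": 2.0}},
    {"features": {"a": float("nan"), "b": float("inf")}},
]
print(find_nonfinite(rows))  # → [(1, 'a'), (1, 'b')]
```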

Can anyone help a poor ML newbie understand why his experiments at improving accuracy are being met by roadblocks such as this? :)

Comments

User 2594 | 11/21/2015, 12:26:32 PM

As a side note, this started when I upgraded from 1.6.1 to 1.7.1.


User 19 | 11/23/2015, 5:55:26 PM

Hi John,

Sorry that the new error message was a bit unclear. I would suggest following suggestion (a) by including an l2_penalty argument, perhaps setting it to 10.0, and trying various smaller values until you have good validation accuracy.

The point is that you have a large number of features for the number of examples you have, so you need to constrain your model (e.g. using the l2_penalty argument) in order for the optimization to be better behaved.
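To see why the penalty helps, here is a tiny self-contained sketch (plain Python, a single coefficient, made-up toy data; not GraphLab's actual solver). With perfectly separable data the unpenalized optimum pushes the coefficient toward infinity, which is exactly the kind of ill-conditioning that makes the optimizer give up; adding an L2 term pulls the coefficient back toward zero:

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def fit_1d_logistic(xs, ys, l2_penalty, lr=0.01, steps=5000):
    """Fit a single logistic-regression coefficient by gradient descent.

    The objective is mean log-loss plus l2_penalty * w**2; its gradient
    gains the term 2 * l2_penalty * w, which shrinks w toward zero.
    """
    w = 0.0
    n = len(xs)
    for _ in range(steps):
        grad = sum((sigmoid(w * x) - y) * x for x, y in zip(xs, ys)) / n
        grad += 2.0 * l2_penalty * w  # the L2 penalty's contribution
        w -= lr * grad
    return w

# Perfectly separable toy data: with no penalty, the "best" w is infinite.
xs = [-2.0, -1.0, 1.0, 2.0]
ys = [0, 0, 1, 1]

w_free = fit_1d_logistic(xs, ys, l2_penalty=0.0)
w_reg = fit_1d_logistic(xs, ys, l2_penalty=10.0)
print(w_free, w_reg)  # the penalized coefficient stays small and well-behaved
```

In GraphLab Create itself this is just the l2_penalty argument, e.g. graphlab.logistic_classifier.create(train, target='label', l2_penalty=10.0), where train and 'label' are placeholders for your own SFrame and target column.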

Let me know if that helps! Chris


User 2594 | 12/5/2015, 8:44:16 PM

Oh, neat! That's new, obviously. Nice, since in my experiments I am working to pare down the number of features... That message tells me that I didn't do it right (yet).