Feature Rescaling for Test dataset

User 761 | 3/17/2015, 6:22:15 AM

Hello again! I have a question regarding feature rescaling in the classifiers.

Say I feed a training set into a classifier with feature rescaling turned on. According to the documentation, it's an L2 normalization and "The coefficients are returned in original scale of the problem". What does this mean exactly? Is the L2 norm of each feature incorporated into the feature coefficients? Suppose I have a logistic regression model. I take all the feature coefficients and want to use them outside GraphLab for new incoming data. Can I use the feature coefficients directly on the new data, or must I scale the incoming data first? If I can use them directly, that's great. Otherwise, where can I find the feature norms calculated by GraphLab so I can modify the new data before applying the coefficients to it?

Thanks!

Comments

User 19 | 3/17/2015, 6:37:35 AM

Hi,

You may use the coefficients directly on the new data.

We scale the data prior to training; the returned coefficients have been rescaled accordingly (so that they are appropriate with respect to the original, unscaled dataset).
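To illustrate why rescaled coefficients can be applied to raw data, here is a minimal sketch. It assumes the rescaling divides each feature column by its L2 norm with no centering, which this thread does not confirm as GraphLab's exact procedure. The training rows, coefficients, and helper names (`l2_norm`, `predict_proba`) are made up for illustration. Under that assumption, dividing each scaled-space coefficient by its feature's norm gives coefficients that produce the same prediction on unscaled input, and the intercept is unchanged.

```python
import math

def l2_norm(column):
    # L2 norm of one feature column (assumed scaling factor)
    return math.sqrt(sum(v * v for v in column))

def predict_proba(intercept, coefs, x):
    # Logistic regression probability for a single row x
    z = intercept + sum(w * v for w, v in zip(coefs, x))
    return 1.0 / (1.0 + math.exp(-z))

# Hypothetical training data: 3 rows, 3 features on very different scales
train_rows = [[1.0, 20.0, 300.0],
              [2.0, -10.0, 150.0],
              [-1.5, 5.0, -250.0]]
norms = [l2_norm(col) for col in zip(*train_rows)]

# Hypothetical coefficients as learned on the scaled features
coefs_scaled = [0.5, -1.2, 2.0]
intercept = 0.3

# Map coefficients back to the original scale: w_original = w_scaled / norm
coefs_original = [w / n for w, n in zip(coefs_scaled, norms)]

# A new, unscaled row
x_new = [0.7, -4.0, 35.0]

# Route 1: scale the new row, then apply the scaled-space coefficients
p_scaled = predict_proba(intercept, coefs_scaled,
                         [v / n for v, n in zip(x_new, norms)])

# Route 2: apply the original-scale coefficients directly to the raw row
p_original = predict_proba(intercept, coefs_original, x_new)

# Both routes agree up to floating-point rounding
assert math.isclose(p_scaled, p_original, rel_tol=1e-9)
```

Route 2 corresponds to using the returned coefficients directly, so the feature norms are not needed at prediction time.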

We do not yet return the feature norms used during training. I agree it might have been clearer if we had returned both the scaled and unscaled coefficients.

Please let us know if you have any other questions, or if you run into any issues!

Chris


User 761 | 3/17/2015, 8:38:45 AM

Thanks Chris! That clears it up.