Deep learning model classifies everything as the same class

User 5157 | 4/27/2016, 1:48:17 PM

Hi, I have a dataset of 11120 examples, each labeled as class 1 or class 2, with 5560 examples per class. I've tried working with deep learning but couldn't get good results. I create and train the model using this code:

shfld_data = gl.cross_validation.shuffle(data_set)
net = gl.deeplearning.create(shfld_data, 'label')
self.network_model = gl.neuralnet_classifier.create(shfld_data, 'label', network=net)

Then I test the model on other examples (the test examples) using this code:

shfld_data = gl.cross_validation.shuffle(test_data)
pred = self.network_model.classify(shfld_data)
results = self.network_model.evaluate(shfld_data)
print results

Unfortunately the results are:

+--------------+-----------------+-------+
| target_label | predicted_label | count |
+--------------+-----------------+-------+
|      1       |        1        |  1113 |
|      2       |        1        |  1112 |
+--------------+-----------------+-------+
[2 rows x 3 columns]
, 'accuracy': 0.5002247095108032}

I don’t understand why it does not predict any example as class 2. Any help would be greatly appreciated!

Comments

User 5159 | 4/28/2016, 5:25:10 PM

Hi @yuh, there are many possible reasons, for example an improper learning rate or improper normalization of the input data. Could you provide the mean and variance of your data?
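
For readers following along, here is a minimal sketch of how per-feature mean and variance could be checked before training. It uses only the Python standard library rather than GraphLab; the helper name `column_stats` and the toy rows are hypothetical, and it assumes each example is a plain list of floats.

```python
from statistics import fmean, pvariance


def column_stats(rows):
    """Return (means, variances), one entry per feature column."""
    columns = list(zip(*rows))  # transpose: rows -> columns
    means = [fmean(col) for col in columns]
    variances = [pvariance(col) for col in columns]  # population variance
    return means, variances


# Hypothetical toy data: 4 examples with 3 features each.
rows = [
    [1.0, 10.0, 0.5],
    [2.0, 10.0, 0.5],
    [3.0, 10.0, 0.5],
    [4.0, 10.0, 0.5],
]
means, variances = column_stats(rows)
print(means)
print(variances)
```

A column whose variance is zero or close to zero carries almost no signal at its current scale, which is the kind of thing this check is meant to surface.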


User 5157 | 5/2/2016, 9:50:35 PM

Hi @Bing, thanks for the reply. The mean is -0.00180829 and the variance is 4.5483e-09. The data is not normalized. I've tried normalizing it to a new min of 0.5 and a new max of 1.0, but that didn't fix the problem. My data is a list of newspaper articles, each represented as a vector of 300 features. I created the vectors using word2vec.
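
The min-max rescaling described above can be sketched roughly as follows. This is an illustration only: the thread does not say whether the rescaling was done per feature column or over the whole dataset, so the per-column form, the helper name `rescale`, and the sample values are all assumptions.

```python
def rescale(values, new_min=0.5, new_max=1.0):
    """Linearly map values from [min(values), max(values)] to [new_min, new_max]."""
    old_min, old_max = min(values), max(values)
    span = old_max - old_min
    if span == 0:
        # Constant column: nothing to stretch, so map everything to new_min.
        return [new_min for _ in values]
    return [new_min + (v - old_min) / span * (new_max - new_min) for v in values]


# Hypothetical values for one feature column, on the scale reported in the thread.
column = [-0.002, -0.001, 0.0, 0.002]
scaled = rescale(column)
print(scaled)
```

Note that a target range of [0.5, 1.0] leaves every input positive and not centered on zero, which differs from the zero-mean, unit-variance scaling that normalization layers aim for.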


User 5159 | 5/11/2016, 4:53:15 PM

Hi @yuh ,

I suggest you try adding a BatchNorm layer right after the input data. If the variance (per feature column) is correct, your word2vec result may be wrong, because the variance looks too small.
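
As a rough illustration of what normalizing the inputs would do to features with such a small variance, here is a sketch of per-feature standardization (zero mean, unit variance) in plain Python. It is a stand-in, not BatchNorm itself: a real batch normalization layer computes its statistics per mini-batch and also learns a scale and shift. The helper name `standardize_columns`, the `eps` value, and the toy rows are hypothetical.

```python
from statistics import fmean, pstdev, pvariance


def standardize_columns(rows, eps=1e-12):
    """Shift and scale each feature column to roughly zero mean and unit variance."""
    columns = list(zip(*rows))
    means = [fmean(col) for col in columns]
    stds = [pstdev(col) for col in columns]  # population standard deviation
    # eps guards against division by zero for constant columns.
    return [
        [(x - m) / (s + eps) for x, m, s in zip(row, means, stds)]
        for row in rows
    ]


# Hypothetical toy data with tiny values, similar in scale to the thread's numbers.
rows = [
    [-0.0019, 0.0001],
    [-0.0018, 0.0003],
    [-0.0017, 0.0002],
]
standardized = standardize_columns(rows)
for col in zip(*standardized):
    print(fmean(col), pvariance(col))
```

After this step each column has a spread of about 1 regardless of how small the raw values were, so the first layer of the network no longer sees inputs that are nearly identical across examples.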