CNN from data vs. reproduced CNN

User 5222 | 5/24/2016, 9:53:30 AM

Hi all, I'm a new member here, though I have been using GraphLab for more than half a year. I'm now doing research on deep learning and have found Dato very much like Matlab for prototyping net architectures. Right now I'm working on classifying traffic signs from the German benchmark, and I've reached 90% accuracy. The model was chosen automatically by GraphLab, and it looks to me like a very simple form of LeNet '98. I'm trying to reproduce the 99% result attained by Ciresan et al. '11, the winner of the traffic sign recognition challenge. I used the layers in the GraphLab libraries and redesigned the architecture exactly as they did. But somehow the accuracy is very poor, hardly passing 10% after hours of training. I guessed the reason was the lack of a ReLU layer after the Conv layers, so I added them, but the classification rate improved only slightly. I searched and found someone here with the same problem, but it hadn't been solved either. So please give me your comments, and hopefully someone will see through the cloud!

Comments

User 5159 | 5/24/2016, 6:44:19 PM

Could you provide your network structure and learning rate? 10% means your network fails to converge.


User 5222 | 5/25/2016, 8:26:29 AM

I designed the network architecture according to this paper (Ciresan et al. 2011): http://people.idsia.ch/~ciresan/data/ijcnn2011.pdf The data had been rescaled to fit, just as the paper suggested. The learning rate was left at the Dato GraphLab default, which is 0.001, I guess. I will look into it more deeply and let you know. Thanks for the answer.


User 5222 | 5/26/2016, 4:56:04 AM

It's strange, Bing. Even when I changed only the first Conv-Pool stage of the Dato net to replicate the first stage of the IDSIA net, the net still failed to learn, even on the training images. In this case the learning rate was fixed; only the hyper-parameters changed. I'm confused.


User 5159 | 5/26/2016, 8:17:06 PM

There are 3 common reasons for a network not converging:
1. The learning rate is too large.
2. Bad initialization.
3. The data is not shuffled or not normalized.

Please check them one by one.
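
For the third item, here is a minimal sketch of what shuffling and normalizing the data could look like, in plain Python. The function names (`shuffle_together`, `standardize`) and the toy pixel values are made up for illustration; this is not GraphLab API, and it assumes each image is a flat list of pixel intensities.

```python
import random
import statistics


def shuffle_together(images, labels, seed=0):
    """Shuffle images and labels with the same permutation (hypothetical helper)."""
    order = list(range(len(images)))
    random.Random(seed).shuffle(order)
    return [images[i] for i in order], [labels[i] for i in order]


def standardize(images):
    """Rescale pixels to zero mean and unit variance over the whole dataset."""
    pixels = [p for img in images for p in img]
    mean = statistics.fmean(pixels)
    # Fall back to 1.0 so a constant dataset does not cause a division by zero.
    std = statistics.pstdev(pixels) or 1.0
    return [[(p - mean) / std for p in img] for img in images]


# Toy data (assumption): four "images" of four pixels each, stored class by class.
images = [[0, 64, 128, 255], [10, 20, 30, 40], [255, 255, 0, 0], [5, 5, 5, 5]]
labels = [0, 1, 2, 3]

shuffled_images, shuffled_labels = shuffle_together(images, labels)
normalized = standardize(shuffled_images)
```

If the training set is stored class by class, as in the toy data, each mini-batch would otherwise see only one or two classes, which is one way an unshuffled dataset can keep a network from converging.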


User 5222 | 5/27/2016, 4:32:03 AM

I agree. But all 3 of those work well with the Dato net. If I change only the structure of the first Conv-Pool stage, it stops learning, even though all 3 remain unchanged. Why?


User 5159 | 5/30/2016, 2:19:09 AM

It is very hard to help you without more information. Could you post the exact code for your network structure and parameters?