
Microsoft Research introduced a new NN model that beats Google and the others

Microsoft researchers recently introduced a new deep (indeed very deep 🙂) NN model (PReLU Net) [1], pushing the state of the art on the ImageNet 2012 dataset from a 6.66% (GoogLeNet) to a 4.94% top-5 error rate.

In this work, they introduce a variant of the well-known ReLU activation function, which they call PReLU (Parametric Rectified Linear Unit). The idea is to allow negative activations through the ReLU function with a control parameter:

f(y_i) = y_i if y_i > 0, otherwise a_i * y_i

where the coefficient a_i is also learned during the training phase. Therefore, PReLU allows negative activations, and the paper argues and empirically shows that PReLU is better at resolving the vanishing gradient problem in very deep neural networks (> 13 layers), precisely because negative activations are allowed. That means more active units per layer, and hence more gradient feedback at the backpropagation stage.
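To make the definition concrete, here is a minimal NumPy sketch of the PReLU forward pass (the function name and the example slope value 0.25 are my own choices for illustration, not from the paper):

```python
import numpy as np

def prelu(y, a):
    """PReLU: identity for positive inputs; negative inputs are
    scaled by the learnable slope `a` instead of being zeroed out."""
    return np.where(y > 0, y, a * y)

x = np.array([-2.0, -0.5, 0.0, 1.0, 3.0])
print(prelu(x, 0.25))  # → [-0.5 -0.125 0. 1. 3.]
```

With a = 0 this reduces to plain ReLU; a fixed small a gives Leaky ReLU. PReLU's twist is that `a` is a parameter updated by backpropagation like any weight, so each layer (or channel) can learn how much negative signal to let through.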


