Stochastic Gradient Formulas for Different Learning Algorithms

Sep 1, 2015

Stochastic gradient update formulas for different learning algorithms. Thanks to http://research.microsoft.com/pubs/192769/tricks-2012.pdf
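As a minimal illustration of the kind of update rule the linked paper tabulates, here is a sketch of the plain stochastic gradient step for a least-squares linear model. The function name, toy data, and learning rate are illustrative choices, not taken from the original post or the paper:

```python
import numpy as np

def sgd_step(w, x, y, lr=0.01):
    """One stochastic gradient step for least-squares regression.

    Per-example loss: L(w) = 0.5 * (w . x - y)^2
    Gradient:         dL/dw = (w . x - y) * x
    Update:           w <- w - lr * dL/dw
    """
    grad = (w @ x - y) * x
    return w - lr * grad

# Toy run: learn y = 2*x from noiseless samples (illustrative only).
rng = np.random.default_rng(0)
w = np.zeros(1)
for _ in range(1000):
    x = rng.uniform(-1.0, 1.0, size=1)
    y = 2.0 * x[0]
    w = sgd_step(w, x, y, lr=0.1)
print(w)  # converges toward [2.0]
```

Variants such as averaged SGD or second-order SGD modify this same update, e.g. by averaging the iterates or rescaling the gradient, as surveyed in the linked paper.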