Neural Network Loss and Activation Derivatives

Aug 24, 2015

Tags: deep learning, machine learning, optimization