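Sigmoid unit: $f(x) = \frac{1}{1 + e^{-x}}$
Tanh unit: $f(x) = \tanh(x)$
Rectified linear unit (ReLU): $f(x) = \sum_{i=1}^{\infty} \sigma(x - i + 0.5) \approx \log(1 + e^{x})$
We call $\sum_{i=1}^{\infty} \sigma(x - i + 0.5)$ the stepped sigmoid and $\log(1 + e^{x})$ the softplus function.
The softplus function can be approximated by the max function (or hard max), i.e. $\max(0, x)$. The max function is …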
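To make the relationship between these units concrete, here is a minimal NumPy sketch. The function names and the truncation of the infinite sum to `n_terms` are my own assumptions for illustration, not part of the original answer. It compares the stepped sigmoid with softplus, and softplus with the hard max.

```python
import numpy as np

def sigmoid(x):
    # Logistic sigmoid: 1 / (1 + e^{-x})
    return 1.0 / (1.0 + np.exp(-x))

def tanh_unit(x):
    return np.tanh(x)

def stepped_sigmoid(x, n_terms=50):
    # Truncated version of sum_{i>=1} sigmoid(x - i + 0.5).
    # n_terms is an assumption; 50 terms is plenty for moderate x.
    i = np.arange(1, n_terms + 1)
    x = np.asarray(x, dtype=float)
    return sigmoid(x[..., None] - i + 0.5).sum(axis=-1)

def softplus(x):
    # log(1 + e^x), computed in a numerically stable way
    return np.logaddexp(0.0, x)

def relu(x):
    # Hard max: max(0, x)
    return np.maximum(0.0, x)

x = np.linspace(-5.0, 5.0, 11)
# Largest gap between the stepped sigmoid and softplus on this grid
print(np.max(np.abs(stepped_sigmoid(x) - softplus(x))))
# Largest gap between softplus and ReLU on this grid (attained at x = 0)
print(np.max(np.abs(softplus(x) - relu(x))))
```

The gap between softplus and ReLU is largest at x = 0, where softplus equals log 2, and shrinks quickly as |x| grows.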
I’ve gathered the following from online research so far:
I’ve used Armadillo a little bit, and found the interface to be intuitive enough, and it was easy to locate binary packages for Ubuntu (and I’m assuming other Linux distros). I haven’t compiled it from source, but my hope is that it wouldn’t be too difficult. It meets most of my design criteria, and uses dense linear algebra. It can call LAPACK or MKL routines.
I’ve heard good things about Eigen, but haven’t used it. It claims to be fast, uses templating, and supports dense linear algebra. It doesn’t have LAPACK or BLAS as a dependency, but appears to be able to do everything that LAPACK can do (plus some things LAPACK can’t). A lot of projects use Eigen, …
Using Stochastic Gradient instead of Batch Gradient
Stochastic Gradient:
Batch Gradient:
Shuffling Examples
Transformation of Inputs
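The tips above can be sketched on a toy problem. This is only an illustration under my own assumptions: a linear least-squares model, made-up data, and arbitrary learning rates. It shows a batch update (one step per pass over all examples), a stochastic update (one step per example), reshuffling the examples every epoch, and standardizing the inputs first.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical toy data: two inputs on very different scales.
X_raw = rng.normal(size=(200, 2)) * np.array([1.0, 50.0]) + np.array([5.0, -20.0])
y = 3.0 * X_raw[:, 0] - 0.2 * X_raw[:, 1] + rng.normal(scale=0.1, size=200)

# Transformation of inputs: zero mean and unit variance per feature,
# plus a constant column that acts as the bias term.
X = (X_raw - X_raw.mean(axis=0)) / X_raw.std(axis=0)
X = np.hstack([X, np.ones((len(X), 1))])

def batch_gradient_descent(X, y, lr=0.1, epochs=200):
    # One update per epoch, using the gradient averaged over all examples.
    w = np.zeros(X.shape[1])
    for _ in range(epochs):
        grad = X.T @ (X @ w - y) / len(y)
        w -= lr * grad
    return w

def stochastic_gradient_descent(X, y, lr=0.01, epochs=50, seed=0):
    # One update per example; the visiting order is reshuffled every epoch.
    rng = np.random.default_rng(seed)
    w = np.zeros(X.shape[1])
    for _ in range(epochs):
        for i in rng.permutation(len(y)):
            grad = (X[i] @ w - y[i]) * X[i]
            w -= lr * grad
    return w

w_batch = batch_gradient_descent(X, y)
w_sgd = stochastic_gradient_descent(X, y)
w_exact = np.linalg.lstsq(X, y, rcond=None)[0]  # closed-form reference
print(w_batch, w_sgd, w_exact)
```

With a constant learning rate, the stochastic version settles in a small neighbourhood of the least-squares solution rather than converging to it exactly, so the comparison is approximate.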
What is anomaly detection? It is the task of detecting an outlier among data points that otherwise follow some regular distribution. The outlier is the anomalous point (Figure 1).
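As a rough illustration of the idea, the sketch below flags points that lie far from the bulk of the data using a simple z-score rule. The rule, the threshold of 3 and the synthetic data are my own assumptions; practical anomaly detectors are often more elaborate.

```python
import numpy as np

def zscore_outliers(points, threshold=3.0):
    # Flag rows where any coordinate lies more than `threshold`
    # standard deviations from that coordinate's mean.
    points = np.asarray(points, dtype=float)
    z = np.abs((points - points.mean(axis=0)) / points.std(axis=0))
    return np.where((z > threshold).any(axis=1))[0]

rng = np.random.default_rng(0)
# 200 points from a standard normal cloud, plus one planted outlier at index 200.
normal_points = rng.normal(loc=0.0, scale=1.0, size=(200, 2))
data = np.vstack([normal_points, [[8.0, 8.0]]])

flagged = zscore_outliers(data)
print(flagged)
```

The planted point at index 200 is flagged. A few ordinary points may also exceed the threshold by chance, which is the usual trade-off when choosing it.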
What are the applications?
Kmeans is one of the simplest and easiest-to-use clustering algorithms (and also a Machine Learning algorithm).
There are 4 basic steps of Kmeans:
Caveats for Kmeans:
Here is the basic animation to show the intuition of Kmeans.
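For readers without the animation, here is a minimal sketch of one common formulation of Kmeans (Lloyd's algorithm). Splitting it into these four steps is my own assumption and may not match the original list exactly; the function name, the random initialisation and the toy data are likewise illustrative.

```python
import numpy as np

def kmeans(X, k, n_iters=100, seed=0):
    rng = np.random.default_rng(seed)
    # 1. Initialise: pick k distinct data points as the starting centroids.
    centroids = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(n_iters):
        # 2. Assign: each point goes to its nearest centroid.
        dists = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        # 3. Update: move each centroid to the mean of its assigned points
        #    (an empty cluster keeps its previous centroid).
        new_centroids = np.array([
            X[labels == j].mean(axis=0) if np.any(labels == j) else centroids[j]
            for j in range(k)
        ])
        # 4. Repeat until the centroids stop moving.
        if np.allclose(new_centroids, centroids):
            break
        centroids = new_centroids
    return centroids, labels

rng = np.random.default_rng(1)
# Hypothetical data: two blobs centred near (0, 0) and (10, 10).
blobs = np.vstack([
    rng.normal(loc=0.0, scale=1.0, size=(100, 2)),
    rng.normal(loc=10.0, scale=1.0, size=(100, 2)),
])
centroids, labels = kmeans(blobs, k=2)
print(centroids)
```

The result depends on the initial centroids, so in practice the algorithm is often run several times with different seeds and the best clustering is kept.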
– Convexity, including convex optimization and formulation of problems as convex programs. Two important subsets of this are linear programming and proximal-gradient-style optimization algorithms and formulations, which have a ridiculously vast array of applications in industrial engineering and machine learning.
– Probabilistic modeling and inference: Graphical models and max-entropy models are the most important, and have a vast array of applications in machine learning and more structured statistical modeling. Markov Chain Monte Carlo is a terrific and amazing algorithm with a great special case called Gibbs sampling; they both present almost generic methods of …
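To give a feel for Gibbs sampling, here is a minimal sketch that samples from a standard bivariate normal with correlation `rho` by alternately drawing each coordinate from its conditional distribution given the other. The target distribution, function name and sample counts are my own choices for illustration.

```python
import numpy as np

def gibbs_bivariate_normal(rho, n_samples=20000, burn_in=1000, seed=0):
    # For a standard bivariate normal with correlation rho:
    #   x | y ~ Normal(rho * y, 1 - rho^2), and symmetrically for y | x.
    rng = np.random.default_rng(seed)
    cond_std = np.sqrt(1.0 - rho ** 2)
    x, y = 0.0, 0.0
    samples = np.empty((n_samples, 2))
    for t in range(burn_in + n_samples):
        x = rng.normal(rho * y, cond_std)  # resample x given the current y
        y = rng.normal(rho * x, cond_std)  # resample y given the new x
        if t >= burn_in:
            samples[t - burn_in] = (x, y)
    return samples

samples = gibbs_bivariate_normal(rho=0.8)
# The empirical correlation should be close to the target rho.
print(np.corrcoef(samples.T)[0, 1])
```

Only the conditional distributions are needed, never the joint density's normalising constant, which is a large part of why the method applies so widely.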