Posts tagged with: machine learning

What Are Hot Topics in the Current Era of Machine Learning?

1. Deep learning [5] seems to be getting the most press right now. It is a form of neural network with many neurons/layers. Articles on deep learning are currently being published in the New Yorker [1] and the New York Times [2].

2. Combining Support Vector Machines (SVMs) and Stochastic Gradient Descent (SGD) is also interesting. SVMs are really interesting and useful because you can use the kernel trick [10] to transform your Continue Reading
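The excerpt cuts off, but the kernel trick it mentions can be illustrated with a small sketch (my own example, not from the post): the polynomial kernel K(x, z) = (x · z)² equals an ordinary inner product in an expanded feature space, so an SVM can work in that space without ever constructing it explicitly.

```python
import math

def dot(a, b):
    return sum(ai * bi for ai, bi in zip(a, b))

def poly_kernel(x, z):
    # kernel computed directly in the original input space
    return dot(x, z) ** 2

def explicit_map(x):
    # the feature map that poly_kernel implicitly uses for 2-D inputs
    x1, x2 = x
    return [x1 * x1, x2 * x2, math.sqrt(2) * x1 * x2]

x, z = [1.0, 2.0], [3.0, 4.0]
k_implicit = poly_kernel(x, z)                      # cheap: stays in 2-D
k_explicit = dot(explicit_map(x), explicit_map(z))  # same value, computed in 3-D
print(k_implicit, k_explicit)
```

Both computations agree (here, 121), which is exactly why an SGD-trained SVM can use kernels: it only ever needs inner products, never the expanded features themselves.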


Some possible Matrix Algebra libraries based on C/C++

I’ve gathered the following from online research so far:

I’ve used Armadillo a little bit, and found the interface to be intuitive enough, and it was easy to locate binary packages for Ubuntu (and I’m assuming other Linux distros). I haven’t compiled it from source, but my hope is that it wouldn’t be too difficult. It meets most of my design criteria, and uses dense linear algebra. It can call LAPACK or MKL routines.

I’ve heard good things about Eigen, but haven’t used it. It claims to be fast, uses templating, and supports dense linear algebra. It doesn’t have LAPACK or BLAS as a dependency, but appears to be able to do everything that LAPACK can do (plus some things LAPACK can’t). A lot of projects use Eigen, Continue Reading


Some possible ways to speed up Neural Network Backpropagation Learning #1

Using Stochastic Gradient instead of Batch Gradient
Stochastic Gradient:

  • faster
  • better suited to tracking changes at each step
  • often results in a better solution – the fluctuation in the weight updates can lead it to different local minima of the cost function –
  • the most common way to implement NN learning.

Batch Gradient:

  • Its convergence behavior is analytically more tractable.
  • Many acceleration techniques are only suited to batch learning.
  • More accurate convergence to a local minimum – again because it avoids the weight fluctuation of the stochastic method –
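The difference between the two update rules can be sketched on a toy problem (my own example, with made-up data): fitting y = w·x by squared error, where the batch rule takes one step per pass over all the data and the stochastic rule updates after every example.

```python
import random

data = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0)]   # toy data: y = 2x
lr = 0.05                                      # learning rate

def batch_step(w):
    # one step along the gradient of the mean squared error over all data
    grad = sum(2 * (w * x - y) * x for x, y in data) / len(data)
    return w - lr * grad

def sgd_epoch(w):
    # one update per (shuffled) example
    examples = data[:]
    random.shuffle(examples)
    for x, y in examples:
        w -= lr * 2 * (w * x - y) * x
    return w

w_batch = w_sgd = 0.0
for _ in range(200):
    w_batch = batch_step(w_batch)
    w_sgd = sgd_epoch(w_sgd)
print(w_batch, w_sgd)   # both approach the true weight 2.0
```

Both reach the same answer here because the cost has a single minimum; the post's point is that on a bumpy cost surface the per-example noise of SGD lets it hop between basins, while the batch rule descends smoothly into one.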

Shuffling Examples

  • Present the more informative instances to the algorithm as the learning proceeds – an instance is more informative if it causes a larger cost or has not been seen before –
  • Do not present successive instances from the same class.

Transformation of Inputs

  • Normalize the input variables to zero mean.
  • Scale the input variables so that their covariances are about the same.
  • Reduce correlations between features as much as possible – two correlated inputs may cause different units to learn the same function, which is redundant –
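The first two transformations amount to standardizing each feature. A minimal sketch (my own; decorrelation, the third point, would additionally need something like PCA and is omitted here):

```python
def standardize(columns):
    # columns: list of feature columns, each a list of values
    out = []
    for col in columns:
        mean = sum(col) / len(col)
        var = sum((v - mean) ** 2 for v in col) / len(col)
        std = var ** 0.5 or 1.0   # guard against constant features
        out.append([(v - mean) / std for v in col])
    return out

cols = standardize([[1.0, 2.0, 3.0], [10.0, 20.0, 60.0]])
for col in cols:
    print(sum(col) / len(col))   # each column now has mean 0 (and unit variance)
```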

Anomaly detection and a simple algorithm with a probabilistic approach

What is anomaly detection? It is the task of detecting an outlier data point among other points that follow some regular distribution. The outlier is the anomalous point (Figure 1).

Figure 1

What are the applications?

  • Fraudulent user activity detection – a way of detecting hacker activity on web applications or network connections by monitoring attributes of the current session. For example, an application can keep track of a user’s inputs to the website and the workload he imposes on the system. Based on the current values of these attributes, the detection system decides whether a particular action is fraudulent and, if so, kicks the user out.
  • Data center monitoring – you might be running a data center with a vast number of computers, so it is really hard to check each machine regularly for flaws. An anomaly detection system can work from each computer’s network connection parameters and its CPU and memory loads to detect any problem on a machine. Continue Reading
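The excerpt ends before the algorithm, but a common probabilistic approach of the kind the title suggests can be sketched as follows (my own illustration with made-up data): fit an independent Gaussian to each feature of the normal points, then flag any point whose density falls below a threshold ε.

```python
import math

def fit(points):
    # per-feature mean and variance of the normal data
    n, d = len(points), len(points[0])
    mu = [sum(p[j] for p in points) / n for j in range(d)]
    var = [sum((p[j] - mu[j]) ** 2 for p in points) / n for j in range(d)]
    return mu, var

def density(x, mu, var):
    # product of independent 1-D Gaussian densities
    p = 1.0
    for j in range(len(x)):
        p *= math.exp(-(x[j] - mu[j]) ** 2 / (2 * var[j])) \
             / math.sqrt(2 * math.pi * var[j])
    return p

normal = [[1.0, 1.1], [0.9, 1.0], [1.1, 0.9], [1.0, 1.0]]
mu, var = fit(normal)
eps = 1e-3                                  # threshold, chosen by eye here
p_typical = density([1.0, 1.0], mu, var)    # high density: not flagged
p_outlier = density([5.0, 5.0], mu, var)    # near-zero density: anomaly
print(p_typical > eps, p_outlier > eps)
```

In the data-center example, the features would be things like CPU load and connection counts, and a machine whose density drops below ε gets flagged for inspection.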

How K-means clustering works

K-means is the most primitive and easiest-to-use clustering algorithm (and also a machine learning algorithm).
There are 4 basic steps of K-means:

  1. Choose K different initial data points in the instance space (as initial centroids) – a centroid is the mean point of a cluster, summarizing the attributes of its class –
  2. Assign each object to the nearest centroid.
  3. After all the objects are assigned, recalculate the centroids by taking the averages of the current classes (clusters).
  4. Repeat steps 2-3 until the centroids are stabilized.
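The four steps above can be sketched in a few lines (my own minimal version, for 1-D points to keep the distance computation trivial):

```python
import random

def kmeans(points, k, iters=100):
    centroids = random.sample(points, k)      # step 1: pick K initial centroids
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for p in points:                      # step 2: assign to nearest centroid
            nearest = min(range(k), key=lambda i: abs(p - centroids[i]))
            clusters[nearest].append(p)
        new = [sum(c) / len(c) if c else centroids[i]   # step 3: recompute means
               for i, c in enumerate(clusters)]
        if new == centroids:                  # step 4: stop once stabilized
            break
        centroids = new
    return sorted(centroids)

points = [1.0, 1.2, 0.8, 10.0, 10.2, 9.8]
centroids = kmeans(points, 2)
print(centroids)   # two centroids, near 1.0 and 10.0
```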

Caveats for K-means:

  • Although it can be proved that the procedure will always terminate, the k-means algorithm does not necessarily find the optimal configuration corresponding to the global minimum of the objective function.
  • The algorithm is also highly sensitive to the initial randomly selected cluster centres. It can be run multiple times to reduce this effect.

Here is the basic animation to show the intuition of K-means.