Posts tagged with: machine learning

Does Large Data Really Help Object Detection?

I stumbled upon an interesting BMVC 2012 paper ("Do We Need More Training Data or Better Models for Object Detection?" by Xiangxin Zhu, Carl Vondrick, Deva Ramanan, and Charless Fowlkes). It claims something contrary to the current big-data notion that larger datasets let us learn better models as the training set grows. Instead, the paper states that large training data is not that helpful for learning better models; indeed, more data is harmful without careful tuning of your system! Continue Reading


How does Feature Extraction work on Images?

Here I share an enhanced version of one of my Quora answers to a similar question …

There is no single answer to this question, since there are many diverse methods for extracting features from an image.

First, what is a feature? “A distinctive attribute or aspect of something.” So the point is to have a set of values for a particular instance that distinguishes that instance from its counterparts. In the image domain, features might be raw pixels for simple problems such as digit recognition on the well-known MNIST dataset. However, for natural images, raw pixels are not descriptive enough. Instead, there are two main streams to follow. One is to use hand-engineered feature extraction methods (e.g. SIFT, VLAD, HOG, GIST, LBP), and the other is to learn features that are discriminative in the given context (e.g. Sparse Coding, Auto Encoders, Restricted Boltzmann Machines, PCA, ICA, K-means). Note that second alternative, Continue Reading


A Large set of Machine Learning Resources for Beginners to Mavens

Note: I regularly update this list.

 Machine Learning 101:

I. Introduction to Machine Learning

II.  Linear Regression

Continue Reading


Best way to qualify your machine learning model.

Selecting your final machine learning model is a vital part of your project. Using the right metric and selection paradigm can give very good results even if you use a very simple or even the wrong learning algorithm. Here, I explain a very parsimonious and plain approach.

The metric you choose depends on your problem and your expectations. Some common alternatives are the F1 score (the harmonic mean of precision and recall), accuracy (the ratio of correctly classified instances to all instances), the ROC curve, or the error rate (1 - accuracy).
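To make these metrics concrete, here is a minimal sketch in plain Python for the binary case. The function name `binary_metrics` and the toy labels are made up for illustration; a real project would more likely use a library implementation.

```python
def binary_metrics(y_true, y_pred, positive=1):
    """Compute accuracy, error rate, precision, recall and F1 for binary labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = len(pairs) - tp - fp - fn

    accuracy = (tp + tn) / len(pairs)
    # Guard against empty denominators (no predicted or no actual positives).
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    # F1 is the harmonic mean of precision and recall.
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return {"accuracy": accuracy, "error_rate": 1.0 - accuracy,
            "precision": precision, "recall": recall, "f1": f1}


# Toy labels, purely illustrative.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]
metrics = binary_metrics(y_true, y_pred)
print(metrics)
```

Which of these numbers you optimise is still a problem-specific decision; the sketch only shows how they relate to each other.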

As an example, I use the error rate (see the figure below). First, divide the data into three parts: a train set, a held-out set, and a test set. We use the held-out set as objective guidance for the hyper-parameters of the algorithm. You might also prefer K-fold cross-validation, but my choice is to keep a held-out set if I have enough instances.
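A three-way split like this can be sketched in a few lines. The 60/20/20 proportions, the seed, and the name `three_way_split` are my assumptions for illustration; the right proportions depend on how much data you have.

```python
import random


def three_way_split(instances, held_out_frac=0.2, test_frac=0.2, seed=0):
    """Shuffle the instances and split them into (train, held_out, test) lists."""
    items = list(instances)
    random.Random(seed).shuffle(items)  # fixed seed so the split is reproducible
    n_test = int(len(items) * test_frac)
    n_held = int(len(items) * held_out_frac)
    test = items[:n_test]
    held_out = items[n_test:n_test + n_held]
    train = items[n_test + n_held:]
    return train, held_out, test


# 100 dummy instance ids stand in for real data.
train, held_out, test = three_way_split(range(100))
print(len(train), len(held_out), len(test))
```

The test set is touched only once, at the very end; all hyper-parameter decisions are made on the held-out set.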

The following procedure can be used for parameter selection and for selecting the final model. The idea is to plot the model's performance as two curves: the test fold (held-out set) score and the train fold score. The curves should meet at a certain point where they are consistent in some sense (train fold and test fold scores are both at reasonable levels), and shortly after that they start to stray from each other (the train fold score keeps improving while the test fold score starts to drop). This straying might reflect underfitting or, after many learning iterations, more likely overfitting. Choose the best trade-off point on the plot as the final model.


Example with error rate, so do not be confused by the decreasing values: lower is better here. The marked point is the saturation point where the model starts to over-fit the data.
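The selection rule described above can be sketched as picking the step with the lowest held-out error. The error curves below are made-up numbers, not measurements, and `select_best_step` is a hypothetical helper; taking the plain minimum is only one simple way to define the "best trade-off point".

```python
def select_best_step(train_errors, held_out_errors):
    """Return the index with the lowest held-out error (earliest one on ties)."""
    if len(train_errors) != len(held_out_errors):
        raise ValueError("curves must have the same length")
    return min(range(len(held_out_errors)), key=lambda i: held_out_errors[i])


# Illustrative curves: the train error keeps falling while the
# held-out error bottoms out and then rises again.
train_errors = [0.40, 0.30, 0.22, 0.15, 0.10, 0.06, 0.03]
held_out_errors = [0.42, 0.33, 0.26, 0.21, 0.20, 0.23, 0.27]

best = select_best_step(train_errors, held_out_errors)
gap = held_out_errors[best] - train_errors[best]
print(best, gap)
```

After the selected step, the gap between the two curves keeps widening, which is the straying effect the plot is meant to reveal.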

Another caveat: do not use too many folds for cross-validation. According to some papers (whose names I cannot recall right now :( ), the asymptotic behaviour of cross-validation tends to favour over-fitting; therefore, use a leave-multiple-out procedure instead of leave-one-out if you plan to use a large number of folds.


Randomness and RandomForests

One of the most rewarding uses of randomness in machine learning is Random Forests. If you are familiar with decision trees, which are used in a vast number of data analysis and machine learning problems, Random Forests are simple to grasp.

For beginners: a decision tree is a simple, deterministic data structure for modelling decision rules for a specific classification problem (theoretically, the shortest possible message length in information-theory jargon). At each node, one feature is selected to make the instance-separating decision. That is, we select the feature that separates instances into classes with the best possible “purity”. This “purity” is measured by entropy, the Gini index, or information gain. Going down toward the leaves, the tree branches to disperse instances of different classes onto different root-to-leaf paths. Therefore, at the leaves of the tree we are able to classify the items into classes. Continue Reading
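As a rough sketch of how "purity" can drive the choice of a split feature, the snippet below computes entropy, the Gini index, and information gain for categorical features. The toy labels and the feature names `perfect` and `useless` are made up for illustration.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * log2(c / n) for c in Counter(labels).values())


def gini(labels):
    """Gini index of a list of class labels (0 means perfectly pure)."""
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())


def information_gain(labels, feature_values):
    """Entropy reduction obtained by splitting on a categorical feature."""
    n = len(labels)
    groups = {}
    for value, label in zip(feature_values, labels):
        groups.setdefault(value, []).append(label)
    # Weighted entropy of the child nodes after the split.
    remainder = sum(len(g) / n * entropy(g) for g in groups.values())
    return entropy(labels) - remainder


labels = ["yes", "yes", "no", "no"]
features = {
    "perfect": ["a", "a", "b", "b"],  # separates the classes completely
    "useless": ["a", "b", "a", "b"],  # leaves both children mixed
}
# Pick the feature with the highest information gain for this node.
best = max(features, key=lambda name: information_gain(labels, features[name]))
print(best)
```

A tree builder would repeat this choice recursively on each child node until the leaves are pure enough.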


Kohonen Learning Procedure K-Means vs Lloyd's K-means

K-means is maybe the most common data quantization method, widely used across many different problem domains. Even though it relies on a very simple idea, it gives satisfying results in a computationally efficient way.

Underlying the K-means optimization, the objective is to minimize the distance between each data point and its closest centroid (cluster center). We can write the objective as:

    \[\arg\min \sum_{i=1}^{k} \sum_{x_j \in S_i} \|x_j - \mu_i\|^2\]


where \(\mu_i\) is the closest centroid to instance \(x_j\).
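To see the objective in action, here is a minimal sketch of one batch (Lloyd-style) iteration in plain Python: assign each point to its closest centroid, then move each centroid to the mean of its points. The helper names and the four toy points are assumptions for illustration only.

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two points."""
    return sum((ai - bi) ** 2 for ai, bi in zip(a, b))


def assign(points, centroids):
    """Index of the closest centroid for each point."""
    return [min(range(len(centroids)), key=lambda i: sq_dist(x, centroids[i]))
            for x in points]


def objective(points, centroids, assignment):
    """Sum of squared distances from each point to its assigned centroid."""
    return sum(sq_dist(x, centroids[i]) for x, i in zip(points, assignment))


def update(points, centroids, assignment):
    """Move each centroid to the mean of its points (keep it if it has none)."""
    new_centroids = []
    for i, old in enumerate(centroids):
        members = [x for x, a in zip(points, assignment) if a == i]
        if members:
            new_centroids.append(
                tuple(sum(coord) / len(members) for coord in zip(*members)))
        else:
            new_centroids.append(old)
    return new_centroids


# Toy data: two obvious groups and two rough initial centroids.
points = [(0.0, 0.0), (0.0, 2.0), (10.0, 0.0), (10.0, 2.0)]
centroids = [(1.0, 1.0), (8.0, 1.0)]

assignment = assign(points, centroids)
before = objective(points, centroids, assignment)
centroids = update(points, centroids, assignment)
after = objective(points, centroids, assign(points, centroids))
print(before, after)
```

Neither the assignment step nor the update step can increase the objective, which is why repeating them converges to a local minimum.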



Continue Reading