This is a technical blog to share, encourage and educate everyone to learn new technologies. I am a Machine Learning Engineer and have worked with several machine learning algorithms. My past work included research on NLP, image and video processing, and human-computer interaction, and I developed several algorithms in these areas while working in the Computer Architecture and Parallel Processing lab of..
Hi there, the last few blogs were hardcore machine learning and AI. Today let's learn something interesting: let's do some magic using computer vision. I hope you all know about Harry Potter's Invisibility Cloak, the one he uses to become invisible. We will see how we can do the same magic trick with the help of computer vision. I will code in Python and use the OpenCV library.
Below is the video for your reference:
The algorithm is very simple: we will separate the foreground and background with segmentation, and then remove the foreground object from every frame. We are using a red-coloured cloth as the foreground object; you can use any other colour of your choice, but you will need to tweak the code accordingly. We will use the following steps:
Import necessary libraries, create output video
Capture and store the background for every frame.
Detect the red coloured part in every frame.
Segment out the red coloured part with a mask image.
Generate the final magical output.
Step1: Import necessary libraries, create output video
Import the libraries. OpenCV is a library of programming functions mainly aimed at real-time computer vision. NumPy is the fundamental package for scientific computing with Python; since machine learning deals with huge amounts of data, we use NumPy arrays, which are faster than normal Python lists. Then prepare the output video.
Step2: Capture and store the background for every frame
The main idea is to replace the current frame's red pixels with background pixels to generate the invisible effect. To do that, first we need to store the background image.
The cap.read() method captures the current frame and stores it in the variable 'background'. The method also returns a Boolean, stored in ret: True if the frame was read correctly, False otherwise.
We capture the background in a for loop, so that we have several frames of background; averaging over multiple frames also reduces noise.
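The capture loop above can be sketched like this. The webcam index, frame count and mirroring are assumptions, and I use a per-pixel median rather than a plain average (it is even more robust to noise):

```python
# Sketch of Step 2: combine many captured frames into one clean background.
import numpy as np

def estimate_background(frames):
    """Combine several BGR frames (HxWx3 uint8) into one background image."""
    stack = np.stack(frames, axis=0)
    # Median across frames is robust to noise and to things moving past.
    return np.median(stack, axis=0).astype(np.uint8)

# With a live camera it would be used roughly like this (hypothetical loop):
# cap = cv2.VideoCapture(0)
# frames = []
# for _ in range(60):
#     ret, frame = cap.read()
#     if ret:
#         frames.append(np.flip(frame, axis=1))  # mirror the frame
# background = estimate_background(frames)
```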
Step3: Detect the red coloured part in every frame
Now we will focus on detecting the red part of the image. As RGB (Red-Green-Blue) values are highly sensitive to illumination, we will convert the RGB image to HSV (Hue-Saturation-Value) space. After converting the frame to HSV space, we will specify a colour range to detect the red colour.
In general, Hue values are distributed over a circle ranging between 0-360 degrees, but in OpenCV the range is 0-180. Red is represented by the values 0-30 as well as 150-180. We use the ranges 0-10 and 170-180 to avoid detecting skin as red, and then combine the two masks with an OR operator (in Python, + can be used on the masks).
Step4: Segment out the red coloured part with a mask image
Now that we know where the red part is in the frame from the mask image, we will use this mask to segment that part out of the whole frame. We will apply a morphological open and a dilation for that.
Step5: Generate the final magical output
Finally, we will replace the pixels of the detected red-coloured region with the corresponding pixel values of the static background we saved earlier, and generate the output which creates the magical effect.
So now you can create your own video with an invisible cloak. You can download the running Python code from here: full code
Hope you enjoyed the magical aspect of computer vision. Do let me know your feedback and suggestions in the comments below. Thank you!
Hello all, I hope from the last few posts you already have a good theoretical understanding of the machine learning algorithms. Today, we will do a Simple Linear Regression implementation with Python. It won't take much time, and I will try to explain every step in simple words.
It is called Simple Linear Regression because it considers only one feature of the input data to make the prediction. For example, here we will consider a housing price data set. As it is simple regression, it will only consider the size of the house to predict its price. Multiple regression, on the other hand, might consider several features to predict the price, such as locality, front/back-facing house, etc. Below is the input data which we will use for the prediction; here house_size (x) is the input, ranging from 1k to 14k square metres, and price (y) of the house ranges from 300 to 1100 dollars.
A scatter plot of the housing data looks like this:
Now we must find a line which fits this scatter plot, known as the regression line, so that we can predict the house price for any given size (x). The equation of the regression line looks like this:
- h(x_i) = B0 + B1 * x_i
where h(x_i) represents the prediction for x_i, and B0, B1 are the regression coefficients. To make the prediction, we need to estimate the regression coefficients (B0, B1). For the implementation we need to follow the steps below:
Step1: Import the libraries. NumPy is the fundamental package for scientific computing with Python; since machine learning deals with huge amounts of data, we use NumPy arrays, which are faster than normal Python lists. Matplotlib is a plotting library in Python; we will use it for visualization.
Step2: Take the means of house_size (x) and price (y). Calculate the cross-deviation (SS_xy) and the deviation about x (SS_xx) as sums of squared deviations.
Step3: Calculate the regression coefficients, which minimize the prediction error (explained in the previous post here: )
Step4: Plot the scatter points on the graph in red. The x-axis represents the size of the house (house_size) and the y-axis represents the price. (figure above)
Step5: Compute the regression line with minimum error and plot it in purple.
Step6: Lastly, write the main function and call it. The final output of the code is:
b_0 = 295.95147839272175
b_1 = 57.31614859742229
And the graph should look like this-
You can download the full code(linearRegression.py) from github here: source code
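If you just want the gist, the whole pipeline can be sketched like this. The house data below is made up (the post's actual data set is not reproduced here), so the coefficients will differ from the output above, and the Matplotlib plotting steps are left out to keep the sketch short:

```python
# Minimal sketch of Steps 1-6 of the simple linear regression post.
import numpy as np

def estimate_coef(x, y):
    n = x.size
    m_x, m_y = np.mean(x), np.mean(y)          # Step 2: means
    # Step 2: cross-deviation SS_xy and deviation about x SS_xx
    ss_xy = np.sum(x * y) - n * m_x * m_y
    ss_xx = np.sum(x * x) - n * m_x * m_x
    # Step 3: regression coefficients
    b_1 = ss_xy / ss_xx
    b_0 = m_y - b_1 * m_x
    return b_0, b_1

# Hypothetical house sizes (k sq. metres) and prices (dollars)
house_size = np.array([1, 2, 4, 6, 8, 10, 12, 14], dtype=float)
price = np.array([300, 370, 480, 600, 730, 850, 960, 1100], dtype=float)

b_0, b_1 = estimate_coef(house_size, price)
print(f"b_0 = {b_0}")
print(f"b_1 = {b_1}")
# Prediction for a new house: h(x) = b_0 + b_1 * x
print("predicted price for size 5:", b_0 + b_1 * 5)
```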
Hope you enjoyed today's post. Stay tuned for more Python implementations. Do let me know your feedback and comments below.
In machine learning, there's a result called the "No Free Lunch" theorem, which says that no single algorithm performs best for every problem. So, you need to figure out which algorithm is best for your problem with the available data set. In today's blog I will focus on the 10 most commonly used machine learning algorithms. As we are going to learn 10 different algorithms in this post, it will be a little longer than usual, but be patient; I will try to make it as simple as possible. So, let's get started~
- Linear Regression is supervised learning; as you may remember from our last lesson, regression is a supervised machine learning algorithm. Linear regression is a model that assumes a linear relationship between the input variable (x) and the single output variable (y) and can predict the output. The representation of linear regression is an equation that describes the line that best fits the relationship between the input variables (x) and the output variable (y), found by learning specific weightings for the input variables called coefficients (B). For example: y = B0 + B1 * x. Example: we will consider the same regression example here (figure below); if we have a data set of house prices with respect to house size, it can predict an unknown house price (q) given the house size (P).
- Some good rules of thumb when using this technique are to remove variables that are very similar (correlated) and to remove noise from your data, if possible. It is a fast and simple technique and good first algorithm to try.
- Logistic regression is like linear regression, but instead of fitting a straight line or hyperplane, the prediction for the output is transformed using a non-linear function called the logistic function or sigmoid function. The function looks like a big S and transforms any output into the 0 to 1 range. For your reference please see the figure below (taken from wiki https://en.wikipedia.org/wiki/Logistic_regression#/media/File:Exam_pass_logistic_curve.jpeg)
- Like linear regression, logistic regression works better when you remove attributes that are unrelated to the output variable, as well as attributes that are very similar (correlated) to each other.
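The S-shaped transform described above can be sketched in a few lines (the score values fed in here are made up for illustration):

```python
# Sketch of the logistic (sigmoid) function: it squashes any real-valued
# score into the (0, 1) range, read as the probability of class 1.
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

print(sigmoid(0))             # 0.5, the middle of the S
print(round(sigmoid(4), 3))   # 0.982, close to 1
print(round(sigmoid(-4), 3))  # 0.018, close to 0

# In logistic regression the score z is the linear part B0 + B1 * x.
```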
Linear Discriminant Analysis
- It consists of statistical properties of your data, calculated for each class. For a single input variable this includes: the mean value for each class, and the variance calculated across all classes. Predictions are made by calculating a discriminant value for each class and predicting the class with the largest value.
- So it is a good idea to remove outliers from your data beforehand. It's a simple and powerful method for classification predictive modelling problems.
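The single-input prediction rule described above can be sketched like this. The class means, shared variance and priors below are invented toy numbers:

```python
# Sketch of single-variable LDA: each class k gets a discriminant value
# from its mean (mu_k), the shared variance (var) and its prior; the
# class with the largest value wins.
import math

def discriminant(x, mu, var, prior):
    return x * mu / var - mu * mu / (2 * var) + math.log(prior)

def lda_predict(x, class_stats):
    # class_stats maps label -> (mean, shared_variance, prior)
    scores = {k: discriminant(x, *s) for k, s in class_stats.items()}
    return max(scores, key=scores.get)

stats = {"A": (1.0, 1.0, 0.5), "B": (5.0, 1.0, 0.5)}
print(lda_predict(0.5, stats))  # A: closer to class A's mean
print(lda_predict(6.0, stats))  # B
```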
Classification and Regression Trees or Decision Trees
- Decision trees are an important type of algorithm for predictive modelling. Each internal node represents a single input variable (x) and a split point on that variable. The leaf nodes of the tree contain an output value (y), which is the prediction of the model. Predictions are made by walking the splits of the tree until arriving at a leaf node and outputting the class value at that leaf node.
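"Walking the splits" can be sketched with a tiny hand-built tree; the nested-dict representation and the tree itself are invented for illustration, not a learned model:

```python
# Sketch of prediction in a decision tree: internal nodes hold a feature
# index and a split point; leaves are bare class labels.
def tree_predict(tree, row):
    if not isinstance(tree, dict):
        return tree                       # reached a leaf: output its label
    if row[tree["feature"]] < tree["split"]:
        return tree_predict(tree["left"], row)
    return tree_predict(tree["right"], row)

# x0 < 2.5 -> "small"; otherwise check x1: x1 < 10 -> "medium", else "large"
tree = {
    "feature": 0, "split": 2.5,
    "left": "small",
    "right": {"feature": 1, "split": 10.0, "left": "medium", "right": "large"},
}

print(tree_predict(tree, [1.0, 99.0]))  # small
print(tree_predict(tree, [3.0, 5.0]))   # medium
print(tree_predict(tree, [3.0, 20.0]))  # large
```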
Naïve Bayes Algorithm
- Bayes' theorem states: P(A|B) = P(B|A)P(A)/P(B), where A and B are events and P(A|B) is a conditional probability: the likelihood of event A occurring given that B is true. P(A) and P(B) are the probabilities of observing A and B independently of each other; these are known as the marginal probabilities.
- Naive Bayes is called 'naïve' because it assumes that each input variable is independent. This is a strong assumption and unrealistic for real data; nevertheless, the technique is very effective on a large range of complex problems.
- The model consists of two types of probabilities that can be calculated directly from the training data: A. the probability of each class, and B. the conditional probability of each x value given each class. Once calculated, the probability model can be used to make predictions for new data using Bayes' theorem.
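The two probability tables can be sketched for a single categorical feature; the tiny "weather -> play" data set below is made up for illustration:

```python
# Sketch of categorical naive Bayes: count-based class priors and
# per-class conditional probabilities, then pick the class maximizing
# P(class) * P(feature value | class).
from collections import Counter, defaultdict

data = [("sunny", "yes"), ("sunny", "yes"), ("rainy", "no"),
        ("rainy", "no"), ("sunny", "no"), ("rainy", "yes")]

# A. Probability of each class
class_counts = Counter(label for _, label in data)
total = len(data)
p_class = {c: n / total for c, n in class_counts.items()}

# B. Conditional probability of each feature value given the class
cond_counts = defaultdict(Counter)
for value, label in data:
    cond_counts[label][value] += 1
p_value_given_class = {
    c: {v: n / class_counts[c] for v, n in counts.items()}
    for c, counts in cond_counts.items()
}

def nb_predict(value):
    scores = {c: p_class[c] * p_value_given_class[c].get(value, 0.0)
              for c in p_class}
    return max(scores, key=scores.get)

print(nb_predict("sunny"))  # yes: P(yes)*P(sunny|yes) beats P(no)*P(sunny|no)
print(nb_predict("rainy"))  # no
```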
K-NN(K Nearest Neighbor) algorithm
- The K nearest neighbors algorithm is a simple procedure that stores all available cases and classifies new cases based on a similarity measure. It is a simple, easy-to-implement supervised machine learning algorithm which can be used for both classification and regression problems. Predictions are made for a new data point by searching through the entire training set for the K most similar neighbors and summarizing the output variable for those K instances. The idea of distance or closeness to neighbors can break down in very high dimensions (lots of input variables), which can negatively affect the performance of the algorithm. This is called the curse of dimensionality, which means you should only use those input variables that are most relevant to predicting the output variable.
- As this algorithm is frequently used and easy to implement, I will try to explain it with the following diagrams and data set. Suppose we have a data set with two groups, group A (blue) and group B (yellow), as shown in the figure below, and we want to classify the unknown point p1 (red). To do so, the algorithm will find the 4 nearest neighbours (as k=4) of the point p1 and label it accordingly.
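The k=4 example can be sketched with made-up 2-D coordinates for groups A and B (the actual points from the figure are not reproduced here):

```python
# Sketch of KNN: find the k nearest training points and take a majority
# vote over their labels.
import math
from collections import Counter

train = [((1.0, 1.0), "A"), ((1.5, 2.0), "A"), ((2.0, 1.5), "A"),
         ((6.0, 6.0), "B"), ((6.5, 5.5), "B"), ((7.0, 6.5), "B")]

def knn_predict(point, train, k=4):
    # Sort the training set by Euclidean distance to the query point
    nearest = sorted(train, key=lambda item: math.dist(point, item[0]))[:k]
    labels = [label for _, label in nearest]
    return Counter(labels).most_common(1)[0][0]

p1 = (1.8, 1.8)
print(knn_predict(p1, train))  # A: 3 of the 4 nearest neighbours are in group A
```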
Learning Vector Quantization(LVQ)
- A downside of K-Nearest Neighbors is that you need to hang on to your entire training dataset. The Learning Vector Quantization algorithm (or LVQ for short) is an artificial neural network algorithm that allows you to choose how many training instances to hang onto and learns exactly what those instances should look like.
- The representation for LVQ is a collection of codebook vectors. These are selected randomly in the beginning and adapted to best summarize the training dataset over a number of iterations of the learning algorithm. After learning, the codebook vectors can be used to make predictions just like K-Nearest Neighbors. The most similar neighbor (best matching codebook vector) is found by calculating the distance between each codebook vector and the new data instance. The class value (or real value in the case of regression) of the best matching unit is then returned as the prediction.
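A minimal LVQ1 sketch with 1-D data follows. The data, initial codebook vectors, learning rate and epoch count are all invented for the demo; the point is the update rule, where the best matching unit is pulled toward same-class points and pushed away from other-class points:

```python
# Sketch of LVQ1 training and prediction on 1-D data.
def train_lvq(data, codebooks, lrate=0.3, epochs=10):
    for _ in range(epochs):
        for x, label in data:
            # Best matching unit: the closest codebook vector
            i = min(range(len(codebooks)),
                    key=lambda j: abs(codebooks[j][0] - x))
            vec, cls = codebooks[i]
            step = lrate * (x - vec)
            # Move toward same-class points, away from other-class points
            codebooks[i] = (vec + step if cls == label else vec - step, cls)
    return codebooks

data = [(0.8, "A"), (1.0, "A"), (1.2, "A"), (4.8, "B"), (5.0, "B"), (5.2, "B")]
codebooks = [(0.0, "A"), (6.0, "B")]        # one codebook vector per class
codebooks = train_lvq(data, codebooks)

def lvq_predict(x, codebooks):
    # Predict like KNN with k=1 over the learned codebook vectors
    return min(codebooks, key=lambda cb: abs(cb[0] - x))[1]

print(lvq_predict(1.0, codebooks))  # A
print(lvq_predict(5.0, codebooks))  # B
```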
Support Vector Machine(SVM)
- Support Vector Machine (SVM) is a supervised machine learning algorithm. It is another algorithm which can be used for both classification and regression problems; however, it is mostly used for classification. In this algorithm, we plot each data item as a point in n-dimensional space (where n is the number of features we have), with the value of each feature being the value of a particular coordinate. Then, we perform classification by finding the hyperplane that best differentiates the two classes. In SVM, the hyperplane is selected to best separate the points in the input variable space by their class, either class 0 or class 1. In two dimensions you can visualize this as a line, and let's assume that all of our input points can be completely separated by this line. The SVM learning algorithm finds the coefficients that result in the best separation of the classes by the hyperplane.
- The best or optimal hyperplane that can separate the two classes is the line that has the largest margin. Only the points closest to the hyperplane are relevant in defining it and in the construction of the classifier. These points are called the support vectors; they support or define the hyperplane. In practice, an optimization algorithm is used to find the values of the coefficients that maximize the margin.
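The geometry above can be sketched numerically. The hyperplane and points below are hand-picked for illustration, not learned by an optimizer:

```python
# Sketch of the SVM geometry: for a hyperplane w.x + b = 0 the signed
# distance of a point is (w.x + b) / ||w||; the margin is set by the
# closest points, i.e. the support vectors.
import math

w, b = (1.0, 1.0), -3.0          # hyperplane x + y - 3 = 0

def signed_distance(point):
    norm = math.hypot(*w)
    return (w[0] * point[0] + w[1] * point[1] + b) / norm

points = {(0.0, 1.0): -1, (1.0, 1.0): -1,   # class -1, below the line
          (2.0, 2.0): +1, (3.0, 3.0): +1}   # class +1, above the line

for p, label in points.items():
    print(p, "class", label, "distance", round(signed_distance(p), 3))

# The smallest absolute distance identifies the support vectors; here
# (1, 1) and (2, 2) sit closest to the line and define the margin.
margin = min(abs(signed_distance(p)) for p in points)
print("margin:", round(margin, 3))
```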
Bagging and Random Forest
- The bootstrap is a powerful statistical method for estimating a quantity, such as a mean, from a data sample. You take lots of samples of your data, calculate the mean of each, and then average all of your mean values to give a better estimate of the true mean value.
- In bagging, the same approach is used, but for estimating entire statistical models, most commonly decision trees. Multiple samples of your training data are taken, then a model is constructed for each data sample. When you need to make a prediction for new data, each model makes a prediction and the predictions are averaged to give a better estimate of the true output value.
- Random forest is a tweak on this approach where decision trees are created so that rather than selecting optimal split points, sub-optimal splits are made by introducing randomness.
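The bootstrap-of-the-mean idea can be sketched like this; the data values, seed and number of resamples are made up for the demo:

```python
# Sketch of the bootstrap: resample the data with replacement many times,
# take each sample's mean, then average those means.
import random
import statistics

random.seed(42)                      # fixed seed so the demo is repeatable
data = [2, 4, 4, 5, 7, 9, 10, 11]

def bootstrap_mean(data, n_samples=500):
    means = []
    for _ in range(n_samples):
        sample = random.choices(data, k=len(data))   # sample WITH replacement
        means.append(statistics.mean(sample))
    return statistics.mean(means)

est = bootstrap_mean(data)
print("plain mean:", statistics.mean(data))
print("bootstrap estimate:", round(est, 2))

# Bagging applies the same trick to whole models (usually decision trees):
# fit one model per bootstrap sample, then average their predictions.
```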
Boosting and AdaBoost
- Boosting is an ensemble technique that attempts to create a strong classifier from a number of weak classifiers. This is done by building a model from the training data, then creating a second model that attempts to correct the errors of the first model. Models are added until the training set is predicted perfectly or a maximum number of models has been added.
- AdaBoost is used with short decision trees. After the first tree is created, the performance of the tree on each training instance is used to weight how much attention the next tree should pay to each training instance. Training data that is hard to predict is given more weight, whereas easy-to-predict instances are given less weight. Models are created sequentially one after the other, each updating the weights on the training instances, which affects the learning performed by the next tree in the sequence. After all the trees are built, predictions are made for new data, and the contribution of each tree is weighted by how accurate it was on the training data.
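One AdaBoost weight update can be sketched directly. The labels and the stump's predictions below are invented; in real AdaBoost the predictions come from a short decision tree:

```python
# Sketch of one AdaBoost re-weighting step: instances the weak learner
# got wrong gain weight, the rest lose weight.
import math

y_true = [+1, +1, -1, -1, +1]
y_pred = [+1, -1, -1, -1, +1]          # the stump misclassifies instance 1
weights = [1 / 5] * 5                  # start uniform

# Weighted error and the tree's "say" (alpha)
err = sum(w for w, t, p in zip(weights, y_true, y_pred) if t != p)
alpha = 0.5 * math.log((1 - err) / err)

# Re-weight: multiply by e^{+alpha} when wrong, e^{-alpha} when right,
# then normalize so the weights sum to 1 again.
weights = [w * math.exp(alpha if t != p else -alpha)
           for w, t, p in zip(weights, y_true, y_pred)]
total = sum(weights)
weights = [w / total for w in weights]

print("alpha:", round(alpha, 3))
print("weights:", [round(w, 3) for w in weights])
# The misclassified instance now carries weight 0.5; the four correct
# ones share the rest, so the next tree focuses on the hard case.
```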
Congratulations guys, now you know the 10 most commonly used machine learning algorithms. In the next post I am planning to write up some commonly asked interview questions on machine learning algorithms. So stay tuned; I will share the next link soon. And don't forget to comment below with any suggestions and feedback. Till then, bye, see you soon.
Hello folks, if you have read my previous three posts on Artificial Intelligence (AI), then congratulations, you have the basic knowledge about machine learning algorithms; if not, please read them first. Today I would like to discuss some of the most commonly asked interview questions in the field of Machine Learning and AI, which should help you crack your machine learning interviews. Most of the basics are already covered; the rest we will learn here.
Let’s get started
What is Gradient Descent?
- Gradient descent is an optimization algorithm which minimizes a given function. Given a function, gradient descent starts with an initial set of parameters and iteratively moves toward the set of parameters which provides the minimum of that particular function. It is a little difficult to visualize, so I will try to give an example with figures for better understanding.
- In the above figure the blue dots are actual house prices (y_Actual) corresponding to the house size, the green line is the predicted house price (y_Prediction) and the yellow dotted lines are the prediction errors (prediction error = y_Prediction - y_Actual). So, the aim is to improve the prediction by minimizing the prediction error. Gradient descent is the algorithm used to minimize the prediction error and optimize the function.
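The idea can be sketched on the same house-price setup: minimize the squared prediction error of the line h(x) = b0 + b1*x by stepping against the gradient. The data points below are made up so the true line (b0=1, b1=2) is easy to recover:

```python
# Sketch of gradient descent on the mean squared error of a line fit.
xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.0, 3.0, 5.0, 7.0]            # exactly y = 1 + 2x
n = len(xs)

b0, b1, lr = 0.0, 0.0, 0.05          # start somewhere, small learning rate
for _ in range(2000):
    # Gradients of the mean squared error with respect to b0 and b1
    g0 = sum(2 * (b0 + b1 * x - y) for x, y in zip(xs, ys)) / n
    g1 = sum(2 * (b0 + b1 * x - y) * x for x, y in zip(xs, ys)) / n
    # Step downhill, against the gradient
    b0 -= lr * g0
    b1 -= lr * g1

print(round(b0, 3), round(b1, 3))   # converges to roughly 1.0 and 2.0
```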
What are the differences between Random forest and Gradient boosting? Or explain the difference between bagging and boosting algorithms.
The difference between Random Forest and Gradient boosting is as follows-
- Random forest uses bagging and samples randomly, whereas gradient boosting uses boosting, sampling with an increased weight on the examples it got wrong previously.
- Because all the trees in a random forest are built without any consideration for any of the other trees, it is incredibly easy to parallelize, which means it can train really quickly. Gradient boosting, on the other hand, is iterative: it relies on the results of the previous tree in order to apply a higher weight to the examples that the previous tree got incorrect. So boosting can't be parallelized, and it takes much longer to train.
- The final predictions for random forest are typically an unweighted average or an unweighted voting, while boosting uses a weighted voting.
- Lastly, random forest is easier to tune, faster to train and harder to overfit, while gradient boosting is harder to tune, slower to train, and easier to overfit.
So, with all that, why would you go with gradient boosting? Well, the trade-off is that gradient boosting is typically more powerful and better-performing, if tuned properly.
What are the benefits of using gradient boosting?
- Well, it's one of the most powerful machine learning classifiers out there. It also accepts various types of inputs, just like random forest, which makes it very flexible. It can be used for classification or regression, and it outputs feature importances, which can be super useful. But it's not perfect. Some of the drawbacks are that it takes longer to train because it can't be parallelized, it's more likely to overfit because it obsesses over the examples it got wrong, and it can get lost pursuing outliers that don't really represent the overall population.
What are Bias and Variance?
- The prediction error in machine learning algorithms can be divided into three types-
o Bias error,
o Variance error and
o Irreducible error
- The irreducible error cannot be reduced whatever algorithm is used, so we will focus on the bias and variance errors.
- Bias consists of the simplifying assumptions made by the model to make the target function easier to approximate. High bias can cause an algorithm to miss the relevant relations between features and target outputs (underfitting).
- Variance is the amount that the estimate of the target function will change given different training data. High variance can cause an algorithm to model the random noise in the training data, rather than the intended outputs (over-fitting).
What is Bias Variance trade-off?
- The bias-variance trade-off is an important aspect of machine learning. To get an accurate model, an engineer's goal is to reduce both bias and variance as much as possible; however, this is not feasible in practice. If a learning algorithm has low bias it must be very flexible, so that it can fit any data. But if the learning algorithm is too flexible it will fit every training data set differently, increasing the variance error. So, there has to be a trade-off between bias and variance when selecting models of different flexibility or complexity, and when selecting appropriate training sets to minimize these sources of error!
Explain the difference between L1 and L2 regularization
- L2 regularization tends to spread error among all the terms, while L1 is more binary/sparse, pushing many variables' weights to exactly 0 so that only a few carry the weight.
Difference between K-Means and KNN (K Nearest Neighbor) algorithms
- The main difference is that K-Means clustering is unsupervised, whereas KNN is a supervised machine learning algorithm. This means KNN needs labelled data for prediction, but K-Means doesn't, as it is unsupervised.
- K-Means is used for clustering problems, whereas KNN is used for classification and regression problems.
What are the different Machine Learning techniques?
- The different types of machine learning algorithms are-
Congratulations!!! Now you know what Artificial Intelligence and Machine Learning are. Now we can go a little deeper and learn about the different machine learning algorithms.
The most important question which comes to a beginner's mind is "which algorithm should I use?" The answer varies depending on many factors, including: the size, quality, and nature of the data; the available computational time; the urgency of the task; and what you want to do with the data. Even an experienced data scientist cannot tell which algorithm will perform best before trying different algorithms.
Before going into the algorithms, first we will see what Supervised, Unsupervised, Semi-Supervised and Reinforcement Learning algorithms are.
What is Supervised Machine Learning?
- In supervised machine learning, the machine is given a set of data for which the correct output is already known, together with the idea that there is a relation between the input and the output. Supervised learning problems are further categorized into "regression" and "classification" problems. In regression problems the machine predicts a numeric or continuous output, whereas in classification problems the predicted output is discrete. For example, if the machine is given a dataset of house prices with respect to house size, it can predict an unknown house price. Whereas if some images are labelled as dogs and cats, the machine can learn the relation between image and label and classify new images as dog or cat. The image below may give you a better understanding-
What is Un-Supervised Machine Learning?
- Unsupervised learning allows the machine to approach a problem with minimal or no idea about what the output will look like. It can derive structures and relations from the given dataset and can find hidden patterns or groupings in the data. It is mainly used for clustering, dimensionality reduction, feature learning, density estimation, etc. Example: K-Means clustering.
What is Semi-Supervised Machine Learning?
- Semi-supervised machine learning algorithms fall somewhere in between supervised and unsupervised learning, since they use both labeled and unlabeled data for training: typically a small amount of labeled data and a large amount of unlabeled data. Systems that use this method can considerably improve learning accuracy. Usually, semi-supervised learning is chosen when labeling the acquired data requires skilled and relevant resources, whereas acquiring unlabeled data generally doesn't require additional resources. Example: speech recognition.
What is Reinforcement Learning?
- Reinforcement learning is a learning method in which an agent interacts with its environment by producing actions and discovering errors or rewards. Trial-and-error search and delayed reward are the most relevant characteristics of reinforcement learning. This method allows machines and software agents to automatically determine the ideal behaviour within a specific context in order to maximize performance. It is employed by various software and machines to find the best possible behaviour or path to take in a specific situation.
Okay, all set, we are now ready to learn the most popular machine learning algorithms, stay tuned for that. Please comment below for any suggestion and feedback.
Today we will start our journey into the world of Artificial Intelligence (AI). We will learn the basic definitions of Artificial Intelligence (AI), Machine Learning (ML), Deep Learning (DL), Natural Language Processing (NLP), Computer Vision and Image Processing. Later we will go deeper into the machine learning algorithms and how those algorithms work. This tutorial is for beginners; if you already have an idea of AI, skip this course and go to the next lesson, where I will discuss different machine learning algorithms.
What is Artificial Intelligence(AI)?
- Artificial intelligence (AI) is the ability of a machine or a computer program to think and learn while doing certain tasks. The concept of AI is based on the idea of building machines capable of thinking, acting, and learning like humans. In other words, it is about creating machines capable of understanding the environment, understanding the problem and acting intelligently according to the situation.
What is Machine Learning(ML)?
- Machine Learning (ML) is an application of AI that provides systems the ability to automatically learn and improve performance without being explicitly programmed. ML focuses on the development of computer programs that can access data and learn for themselves. The main aim is to allow computers to learn automatically, without human intervention or assistance, and act accordingly.
- The next question in your mind may be: how does the machine learn? The answer: much as a human learns. First the machine gathers information and knowledge, then uses that knowledge to take decisions. Past experience also helps it take better decisions in the future.
What is Deep Learning(DL) or Deep Neural Network(DNN)?
- Deep Learning (DL) is part of the broader family of machine learning and AI which emulates the learning approach that human beings use to gain certain types of knowledge. Traditional machine learning algorithms tend to be linear, whereas in deep learning the algorithms are stacked in a hierarchy of increasing complexity and abstraction. Because this process mimics a system of human neurons, deep learning is sometimes referred to as deep neural learning or a Deep Neural Network (DNN). Let me explain the concept with an example below-
- A baby starts learning what a cat is (and is not) by pointing to objects and saying the word cat. The parent guides him by saying, "Yes, that is a cat," or, "No, that is not a cat." As the baby continues to point to objects, he becomes more aware of the features that all cats have. What the baby does, without knowing it, is clarify a complex abstraction by building a hierarchy in which each level of abstraction is created with knowledge gained from the preceding layer of the hierarchy. A machine follows a more or less similar approach. Each layer in the hierarchy applies a nonlinear transformation to its input and uses what it learns to create a statistical model as output. Iterations continue until the output has reached an acceptable level of accuracy. The number of processing layers through which data must pass is what inspired the label deep.
What is Natural Language Processing(NLP)?
- Natural Language Processing is the ability of a computer program to understand human language as it is spoken. NLP is also a component of AI. The development of NLP is challenging because computers traditionally require humans to speak to them in a programming language, or in unambiguous, highly structured, clear commands, whereas natural languages are often ambiguous and have different structures, dialects and regional variations which are difficult to distinguish.
- Semantic analysis and Natural Language Processing can help machines automatically understand text, which supports the even larger goal of translating information into the realm of business intelligence: understanding a potentially valuable piece of customer feedback, or extracting the insight in a tweet or a customer service log, for customer support, corporate intelligence or knowledge management.
What is Computer Vision and Image Processing?
- Computer vision is about granting the computer the ability to 'see' and 'understand' what it sees. In image processing you get an image as input and produce a processed image as output, whereas in computer vision you get an image (or video) as input and produce other quantitative data as output (e.g. geometrical information about the objects in question). Computer vision tries to do what a human brain does with the retinal input; it includes understanding, predicting and detecting certain things. For example, given an input image, using computer vision the computer can classify the objects (cars, humans, trains, etc.) as a human does. There are many other applications, but this is just to give you a basic idea.
These were the basic concepts. Please comment below if you have any questions or feedback. Stay tuned for more detailed concepts of machine learning algorithms.
Hi all, hope you are all doing well. Today I will explain the face detection procedure used in OpenCV.
Step1: Create a cascade classifier using the training algorithm provided by OpenCV and get the .xml file.
Step2: Load the pre-made .xml file.
Step3: Input a frame from the camera / an input image and convert it to a greyscale image.
Step4: Use OpenCV's 'CascadeClassifier::detectMultiScale()' function to detect faces of different sizes in the input image.
Explanation of CascadeClassifier::detectMultiScale() – Parameters:
i. image - Matrix of the type CV_8U containing an image where objects are detected.
ii. objects – Vector of rectangles where each rectangle contains the detected object.
iii. scaleFactor – Parameter specifying how much the image size is reduced at each image scale.
iv. minNeighbors – Parameter specifying how many neighbours each candidate rectangle should have to retain it.
v. minSize – Minimum possible object size. Objects smaller than that are ignored.
vi. maxSize – Maximum possible object size. Objects larger than that are ignored.
• Basically, what 'CascadeClassifier::detectMultiScale()' does is take the original image, create an image pyramid from it using the resize factor, and search for faces/objects in it. An image pyramid is a multi-scale representation of an image, such that face detection can be scale-invariant, i.e., detecting large and small faces using the same detection window.
• This gives the ability to detect faces/objects at a single model scale throughout different image scales, meaning that if a detection happens at a specific layer, the bounding box is rescaled by the same amount the original image was resized to reach that pyramid layer.
• Using this technique you can detect faces at multiple scales with only a single model scale, which is computationally less expensive than training a model for each possible scale and running all of those over a single image.
• To make a good detector you need to train the cascade properly with a good number of sample images. Here is an example-
I had never thought that I would write a blog someday. But over the last 2 weeks I struggled a lot to find out how to use OpenCV's utility to train a cascade classifier. I could not find a single tutorial about the newer application OpenCV provides for training a cascade classifier, i.e. opencv_traincascade. All the blogs and tutorials I found only talk about the older version, opencv_haartraining.
Currently there are two applications in OpenCV to train a cascade classifier: opencv_haartraining and opencv_traincascade. opencv_haartraining is now obsolete, and plenty of tutorials already cover it, so I will only talk about opencv_traincascade.
A. First things first: install OpenCV. Once you have installed it, look under your opencv/apps folder; you will see two folders, "haartraining" and "traincascade". We will use the "traincascade" folder for training, and "createsamples.cpp" from the "haartraining" folder to create the positive samples. I will explain it step by step.
B. First we will build createsamples.exe from the .cpp files using Visual Studio 2010. To do so, follow the steps below-
a. Open Visual Studio 2010 -> New Project -> project name (createsample) -> Empty Project -> Finish.
b. In Solution Explorer on the right, right-click on "Header Files" and add the existing files "_cvcommon.h", "_cvhaartraining.h", "cvclassifier.h" and "cvhaartraining.h".
c. Right-click on "Source Files" in Solution Explorer and add the existing files "createsamples.cpp", "cvboost.cpp", "cvcommon.cpp", "cvhaarclassifier.cpp", "cvhaartraining.cpp" and "cvsamples.cpp".
d. Add all the include folders, library folders and additional dependencies in the project properties and build the solution.
e. You can find createsamples.exe in your project's Debug folder.
C. Similarly we will build traincascade.exe. For traincascade.exe we will add "boost.h", "cascadeclassifier.h", "haarfeatures.h", "imagestorage.h", "lbpfeatures.h" and "traincascade_features.h" to the "Header Files", and "boost.cpp", "cascadeclassifier.cpp", "features.cpp", "haarfeatures.cpp", "imagestorage.cpp", "lbpfeatures.cpp" and "traincascade.cpp" to the "Source Files".
D. Now that we have createsamples.exe and traincascade.exe we can start the main process.
a. Create a folder in a fresh path, say C:\haar.
b. Create 2 folders inside C:\haar: i. Positive, ii. Negative.
c. Positive images: these images contain the object to be detected. Keep all the positive images inside the Positive folder, then create a 'positive.txt' file that lists, for each image, the image location, the number of positive samples present in that image, and the x and y coordinates plus the width and height of each of those samples. Below is an example of 'positive.txt', containing the location of the image positive1, the number of positive samples in each image, and the x & y coordinates and width and height of each sample.
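A small Python sketch of generating such a description file may make the format clearer. The file names and coordinates below are hypothetical, not from the original tutorial.

```python
# Write a hypothetical 'positive.txt' description file. Each line holds:
# <image path> <number of samples> then x y width height for each sample.
entries = [
    ("Positive/positive1.jpg", [(140, 100, 45, 45)]),                    # 1 object
    ("Positive/positive2.jpg", [(100, 200, 50, 50), (50, 30, 25, 25)]),  # 2 objects
]

with open("positive.txt", "w") as f:
    for path, boxes in entries:
        coords = " ".join(str(v) for box in boxes for v in box)
        f.write(f"{path} {len(boxes)} {coords}\n")
```

Each bounding box must lie inside its image; createsamples crops these regions out as the positive samples.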
d. After preparing the positive images and the information text file, we will create a "vec" file. Since the original positive images can have different heights and widths, they are resized and packed as thumbnails into the vec file. A vec file has a header (the number of positive samples, width and height) and contains the positive thumbnails in its body. To create the vec file, open the command prompt (cmd) and type the following-
Where 100 is the number of positive samples used for training, and 24 is both the height and the width of the positive samples. This vec file will be created at C:\haar; we can inspect it by typing the following in the cmd-
C:\haar\createsamples.exe -vec positives.vec -w 24 -h 24
e. Negative images: negative images can be anything that does not contain positive samples. Keep all the negative images in C:\haar\Negative and create a 'negative.txt' file containing the paths of all the negative images, something like this-
Negative\negative1.jpg
Negative\negative2.jpg
...
E. Now that we have everything we need to train a cascade classifier, we can call traincascade.exe from the command prompt. To start training, type the following command-
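The negative description file is just a plain list of image paths, so it is easy to generate with a few lines of Python. The folder layout and file names here are illustrative; the sketch creates two empty placeholder files so it is self-contained.

```python
import os

# Collect every image in the Negative folder into 'negative.txt'
# (one relative path per line, as traincascade expects for -bg).
neg_dir = "Negative"
os.makedirs(neg_dir, exist_ok=True)

# Create two placeholder files so the sketch runs on its own.
for name in ("negative1.jpg", "negative2.jpg"):
    open(os.path.join(neg_dir, name), "wb").close()

with open("negative.txt", "w") as f:
    for name in sorted(os.listdir(neg_dir)):
        if name.lower().endswith((".jpg", ".png")):
            f.write(os.path.join(neg_dir, name) + "\n")
```

In the real setup you would of course point neg_dir at C:\haar\Negative, already filled with background images.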
C:\haar\traincascade.exe -data result/ -vec positives.vec -bg negative.txt -numPos 2000 -numNeg 3000 -numStages 20 -w 24 -h 24
Where '-data' specifies the directory where the result will be saved as a .xml file; the final classifier ends up in the "cascade.xml" file, and you can remove all the other files inside the result folder. '-vec' specifies the vec file of positive samples. '-bg' specifies the background file, i.e. the text file listing the negative images. '-numPos' is the number of positive samples and '-numNeg' the number of negative samples used for training. '-numStages' is the number of stages to be trained. Once the training process starts you will see something like this-
Be patient; the whole process might take 4-5 days to complete.