# Something More for Research


# Archive for January, 2015

## Professional ways of tracking GPU memory leakage

Posted by Hemprasad Y. Badgujar on January 25, 2015

Depending on what I am doing and what I need to track, trace, or profile, I use all 4 packages above. They also have the added benefit of being a) free, b) well maintained, c) free, d) regularly updated and e) free.

In case you hadn't guessed, I like the free part. :)

Regarding object management, I would recommend an old C++ coding principle: as soon as you create an object, add the line that deletes it; every new should always (eventually) have a matching delete. That way you know that you are destroying the objects you create. However, it will not save you from orphaned-block memory leaks, where you change where pointers are pointing before freeing what they pointed to, for example:

myclass* firstInstance = new myclass();
myclass* secondInstance = new myclass();
firstInstance = secondInstance;  // the original firstInstance object is now orphaned
delete firstInstance;            // actually deletes the object secondInstance points to
delete secondInstance;           // bug: deletes the same object a second time


You will now have created a small memory leak: the data for the real firstInstance is no longer pointed at by any pointer (and, as a bonus bug, the remaining object is deleted twice). This is very hard to detect when it happens in a large code-base, and more common than it should be.

Generally, these are the pairings you need to be aware of to ensure you properly dispose of all your objects:

new -> delete
new[] -> delete[]
malloc() -> free()
calloc() -> free()
realloc(nonzero) -> free()

Note: using realloc(ptr, 0) as a substitute for free() is implementation-defined behaviour; prefer free().
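Used correctly, each pairing in the list above looks like this in practice (a minimal sketch; `myclass` is a stand-in type):

```cpp
#include <cstdlib>   // std::malloc, std::calloc, std::realloc, std::free

struct myclass { int value = 0; };

// Exercise each allocate/release pairing once; returns true if every allocation succeeded.
bool pairingsDemo() {
    myclass* one = new myclass();                       // new   -> delete
    delete one;

    myclass* many = new myclass[8];                     // new[] -> delete[]  (never plain delete)
    delete[] many;

    int* m = (int*)std::malloc(8 * sizeof(int));        // malloc()  -> free()
    bool ok = (m != nullptr);
    std::free(m);

    int* c = (int*)std::calloc(8, sizeof(int));         // calloc()  -> free()
    int* r = (int*)std::realloc(c, 16 * sizeof(int));   // realloc() -> free() the returned pointer
    ok = ok && (r != nullptr);                          // (real code must keep c in case realloc fails)
    std::free(r);
    return ok;
}
```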


If you are coming to C++ from a language with garbage collection, it can take a while to get used to, but it quickly becomes habit. :)
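The habit can also be automated away: with RAII smart pointers the delete happens in the destructor, so the reassignment mistake shown earlier cannot leak. A minimal sketch using only the standard library (`myclass` is a stand-in type):

```cpp
#include <memory>
#include <utility>

struct myclass { int value = 0; };

// Reassigning a unique_ptr deletes the previously owned object instead of leaking it.
bool uniquePtrReassignmentFreesOldObject() {
    std::unique_ptr<myclass> firstInstance  = std::make_unique<myclass>();
    std::unique_ptr<myclass> secondInstance = std::make_unique<myclass>();
    firstInstance = std::move(secondInstance);  // the old firstInstance object is deleted here
    return firstInstance != nullptr && secondInstance == nullptr;
}

// A shared_ptr keeps the object alive until the last owner releases it.
long sharedPtrRefcountAfterCopy() {
    std::shared_ptr<myclass> a = std::make_shared<myclass>();
    std::shared_ptr<myclass> b = a;  // refcount is now 2; no copy of the object is made
    a.reset();                       // refcount back to 1; object still alive via b
    return b.use_count();
}
```

No explicit delete appears anywhere, yet every object is freed exactly once.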

## Building Static zlib v1.2.7 with MSVC 2012

Posted by Hemprasad Y. Badgujar on January 22, 2015

This post will explain how to obtain and build the zlib C programming library from source, using MS Visual Studio 2012 on Windows 7. The result will be a static release version that you can use in your C/C++ projects for data compression, or as a dependency for other libraries.

## The Environment

1. Decompress and untar the library with 7-Zip and you'll end up with a directory path similar to this:
C:\Users\%USERNAME%\Downloads\lib\zlib-1.2.7\

## Building

1. Modify “libs\zlib-1.2.7\contrib\masmx86\bld_ml32.bat,” adding “/safeseh” to the following two lines.
Before:

ml /coff /Zi /c /Flmatch686.lst match686.asm
ml /coff /Zi /c /Flinffas32.lst inffas32.asm

After:

ml /safeseh /coff /Zi /c /Flmatch686.lst match686.asm
ml /safeseh /coff /Zi /c /Flinffas32.lst inffas32.asm
2. Open the solution file that came with the package, “libs\zlib-1.2.7\contrib\vstudio\vc10\zlibvc.sln,” and upgrade the solution file if necessary to MSVC 2012.
3. Change to “Release” configuration.
4. Remove “ZLIB_WINAPI;” from the “zlibstat” project’s property page: “Configuration Properties → C/C++ → Preprocessor → Preprocessor Definitions”
5. Build the solution.
6. The new static library file is created in a new subfolder:
C:\Users\%USERNAME%\Downloads\lib\zlib-1.2.7\contrib\vstudio\vc10\x86\ZlibStatRelease\zlibstat.lib

## Installing

1. Create a place for the zlib library with “zlib” and “lib” subfolders.
mkdir "C:\workspace\lib\zlib\zlib-1.2.7\zlib"
mkdir "C:\workspace\lib\zlib\zlib-1.2.7\lib"
2. Copy the header files.
xcopy "C:\Users\%USERNAME%\Downloads\lib\zlib-1.2.7\*.h" "C:\workspace\lib\zlib\zlib-1.2.7\zlib"
3. Copy the library file.
xcopy "C:\Users\%USERNAME%\Downloads\lib\zlib-1.2.7\contrib\vstudio\vc10\x86\ZlibStatRelease\zlibstat.lib" "C:\workspace\lib\zlib\zlib-1.2.7\lib\zlibstat.lib"
4. Add the include and lib paths to the default project property page in MSVC 2012:
View → Other Windows → Property Manager → Release/Debug → Microsoft.Cpp.Win32.user.
Be sure to save the property sheet so that the changes take effect.

## Testing

1. Create a new project, “LibTest” in MSVC 2012.
2. Explicitly add the zlib library to the project: Project → Properties → Linker → Input → Additional Dependencies = “zlibstat.lib;”
3. Create a source file in the project and copy the “zpipe.c” example code.

Build the project. It should compile and link successfully.

## Potential Issues

These are some of the problems that you might run into while trying to build zlib.

### LNK2026: module unsafe for SAFESEH image

Need to include support for safe exception handling. Modify “libs\zlib-1.2.7\contrib\masmx86\bld_ml32.bat,” adding “/safeseh” to the following two lines.
Before:

ml /coff /Zi /c /Flmatch686.lst match686.asm
ml /coff /Zi /c /Flinffas32.lst inffas32.asm

After:

ml /safeseh /coff /Zi /c /Flmatch686.lst match686.asm
ml /safeseh /coff /Zi /c /Flinffas32.lst inffas32.asm

### LNK2001: unresolved external symbol _inflateInit_

The code is trying to link with the DLL version of the library instead of the static version. Remove “ZLIB_WINAPI;” from the “zlibstat” project’s property page: “Configuration Properties → C/C++ → Preprocessor → Preprocessor Definitions”

## Detect and Track Objects

Posted by Hemprasad Y. Badgujar on January 20, 2015

## Detect and Track Objects With OpenCV

In the following, I have made an overview of tutorials and guides on getting started with OpenCV for detecting and tracking objects. OpenCV is a computer-vision library designed to analyze, process, and understand objects in images, with the aim of producing information.

• OpenCV Tutorials – a comprehensive list of basic OpenCV tutorials with source code based on the OpenCV library;
• Object Detection & Tracking Using Color – an example application where OpenCV is used to detect objects based on color differences;
• Face Detection Using OpenCV – a guide to using OpenCV to detect one or more faces in the same image;
• SURF in OpenCV – a tutorial on the SURF algorithm, designed to detect key-points and descriptors in images;
• Introduction to Face Detection and Face Recognition – face detection and recognition are two of the most common applications in computer vision, from robotics onward; this tutorial presents the steps by which a face is detected and recognized in images;
• Find Objects with a Webcam – using a simple webcam mounted on a robot and the Simple Qt interface designed to work with OpenCV; as you can see in this tutorial, any object can be detected and tracked in images;
• Features 2D + Homography to Find a Known Object – a tutorial with program code and explanations of two important OpenCV functions used to find objects in images. findHomography estimates a transform from matched key-points, while perspectiveTransform maps points from one image into the other using that transform;
• Back Projection – a tutorial on the calcBackProject function, designed to calculate the back projection of a histogram;
• Tracking Colored Objects in OpenCV – a tutorial on detecting and tracking colored objects in images using the OpenCV library;
• OpenCV Tutorials – Based on “Learning OpenCV – Computer Vision with the OpenCV Library” – tutorials useful for both beginners and advanced users to become familiar with computer-vision concepts, start building applications, or improve their skills;
• Image Processing on Pandaboard using OpenCV and Kinect – a presentation with information about image processing on a Pandaboard single-board computer using the Kinect sensor and the OpenCV library;
• Video Capture using OpenCV with VC++ – the OpenCV library can be integrated with Visual Studio, and this article explains to you as a programmer how to use Visual C++ together with OpenCV;
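As background for the findHomography/perspectiveTransform pair mentioned above: perspectiveTransform simply applies a 3×3 homography H to each point in homogeneous coordinates, dividing out the third coordinate. A dependency-free sketch of that per-point mapping (the matrices used to exercise it are made up for illustration):

```cpp
#include <array>
#include <cmath>

// Apply a 3x3 homography H to a 2D point (x, y), as cv::perspectiveTransform
// does for each point: (x', y') = ((h00*x + h01*y + h02)/w, (h10*x + h11*y + h12)/w),
// where w = h20*x + h21*y + h22.
std::array<double, 2> applyHomography(const std::array<std::array<double, 3>, 3>& H,
                                      double x, double y) {
    double xp = H[0][0] * x + H[0][1] * y + H[0][2];
    double yp = H[1][0] * x + H[1][1] * y + H[1][2];
    double w  = H[2][0] * x + H[2][1] * y + H[2][2];
    return {xp / w, yp / w};  // divide out the homogeneous coordinate
}
```

With H equal to the identity a point maps to itself; with a pure translation homography it shifts by (tx, ty).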

## Tutorials for Detecting and Tracking Objects with Mobile Devices

Mobile devices such as smartphones and tablets with iOS or Android operating systems can be integrated into robots and used to detect and track objects. Below is an overview of tutorials with comprehensive information for tracking objects using different mobile devices.

## Particle filter based trackers

• Particle Filter Color Tracker [Link 1]
• Matlab and C/C++ code.
• Key words: region tracker, color histogram, ellipsoidal region, particle filter, SIR resampling.
• Region Tracker based on a color Particle Filter [Link 1] [Example]
• Matlab and C/C++ code.
• Key words: region tracker, color histogram, ellipsoidal region, particle filter, SIR resampling.
• Region Tracker based on an intensity Particle Filter [Link]
• Matlab and C/C++ code.
• Key words: region tracker, intensity histogram, ellipsoidal region, particle filter, SIR resampling.
• Particle Filter Object Tracking [Link]
• C/C++.
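The SIR resampling step these trackers share can be sketched without any tracking code at all: draw a new particle set with probability proportional to the weights. Shown here as systematic resampling, one common low-variance scheme (the weights in the test are made up for illustration):

```cpp
#include <vector>
#include <random>
#include <numeric>

// Systematic resampling: one random offset, then n evenly spaced pointers walked
// along the cumulative weight distribution. Returns, for each output slot,
// the index of the particle copied into it.
std::vector<int> systematicResample(const std::vector<double>& weights, std::mt19937& rng) {
    const int n = (int)weights.size();
    double total = std::accumulate(weights.begin(), weights.end(), 0.0);
    std::uniform_real_distribution<double> u01(0.0, 1.0 / n);
    double u = u01(rng);                  // single random offset in [0, 1/n)
    std::vector<int> indices(n);
    double cum = weights[0] / total;      // running cumulative normalized weight
    int i = 0;
    for (int j = 0; j < n; ++j) {
        double pointer = u + (double)j / n;
        while (pointer > cum && i < n - 1)
            cum += weights[++i] / total;
        indices[j] = i;                   // particle i survives into slot j
    }
    return indices;
}
```

Heavily weighted particles are duplicated, near-zero-weight particles die off, and the new set is again equally weighted.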

## Mean shift based trackers

• Scale and Orientation Adaptive Mean Shift Tracking. [Link]
• Matlab.
• Robust Mean Shift Tracking with Corrected Background-Weighted Histogram. [Link]
• Matlab.
• Robust Object Tracking using Joint Color-Texture Histogram. [Link]
• Matlab.
• Tracking moving video objects using mean-shift algorithm. [Link]
• Matlab.
• Mean-shift Based Moving Object Tracker [Link]
• C/C++.
• Mean-Shift Video Tracking [Link]
• Matlab.
• Gray scale mean shift algorithm for tracking. [Link]
• Matlab.
• Mean shift tracking algorithm for tracking [Link]
• Matlab.
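All of the trackers in this group iterate the same core update: shift a window to the weighted centroid of the samples it covers, repeating until it settles on a local mode. A dependency-free one-dimensional sketch (the sample data in the test is made up for illustration):

```cpp
#include <vector>
#include <cmath>

// One mean-shift step: move the window center to the weighted centroid of the
// samples that fall inside the current window of the given radius.
double meanShiftStep(const std::vector<double>& positions,
                     const std::vector<double>& weights,
                     double center, double radius) {
    double num = 0.0, den = 0.0;
    for (size_t i = 0; i < positions.size(); ++i) {
        if (std::fabs(positions[i] - center) <= radius) {
            num += weights[i] * positions[i];
            den += weights[i];
        }
    }
    return den > 0.0 ? num / den : center;  // stay put if the window is empty
}

// Iterate until the shift falls below a tolerance; mean shift converges to a local mode.
double meanShift(const std::vector<double>& positions,
                 const std::vector<double>& weights,
                 double center, double radius, double tol = 1e-6, int maxIter = 100) {
    for (int it = 0; it < maxIter; ++it) {
        double next = meanShiftStep(positions, weights, center, radius);
        if (std::fabs(next - center) < tol) return next;
        center = next;
    }
    return center;
}
```

In the image trackers above, `weights` comes from a color or intensity histogram back-projection and the window is two-dimensional, but the update is the same.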

## Deformable/articulable object trackers

• Visual Tracking with Integral Histograms and Articulating Blocks [Link]
• Matlab and C/C++ code.
• Key words: region tracker, intensity histogram, multi-rectangular regions, integral histogram, exhaustive search, graph cut segmentation.

## Appearance learning based trackers

• Robust Object Tracking with Online Multiple Instance Learning. [Link]
• C/C++.
• Visual Tracking via Adaptive Structural Local Sparse Appearance Model. [Link]
• Matlab.
• Online Discriminative Object Tracking with Local Sparse Representation. [Link]
• Matlab
• Superpixel Tracking. [Link]
• Matlab.
• Online Multiple Support Instance Tracking. [Link]
• Matlab.
• Incremental Learning for Robust Visual Tracking. [Link]
• Matlab.
• Tracking with Online Multiple Instance Learning (MILTrack). [Link]
• C/C++, OpenCV
• Predator. [Link]
• Matlab.
• Object Tracking via Partial Least Squares Analysis. [Link]
• Matlab.
• Robust Object Tracking via Sparsity-based Collaborative Model. [Link]
• Matlab.
• On-line boosting trackers. [Link]
• C/C++.

## Advanced appearance model based trackers

• Real-Time Compressive Tracking [Link]

## Resources

Below is a list of resources, including OpenCV documentation, libraries, and OpenCV-compatible tools.

## Draw graphs using OpenCV

Posted by Hemprasad Y. Badgujar on January 20, 2015

This is a basic graphing library that allows plotting graphs on the screen or into an image using OpenCV. It can be very useful for viewing the contents of a numerical array, for example while testing an algorithm.

This “library” is just a collection of functions that can be used to simply plot a graph of an array in its own window, or to overlay graphs onto an existing IplImage. This makes it both easy to use and powerful enough for more complex uses, such as combining multiple graphs into one.

### Showing a simple graph of an array

Here is a simple example to see a graph of a float array in a new window, by calling:

showFloatGraph("Rotation Angle", floatArray, numFloats );


The same sort of graph could be shown from a std::vector of floats or ints or even a byte array:

showFloatGraph("Rotation Angle", &floatVector[0], floatVector.size());
showIntGraph("Rotation Angle", &intVector[0], intVector.size());
showUCharGraph("Pixel Values", pixelData, numPixels);


Note that the window will only stay active for half a second by default. To make it wait until the user hits a key, add “0” as the wait time, by calling:

showIntGraph("Rotation Angle", &intVector[0], intVector.size(), 0);


### Drawing multiple graphs into an IplImage

It is also possible to draw a graph into an existing image of your own, such as to overlay a graph on top of your work, or to graph multiple values in the same graph:

IplImage *graphImg = drawFloatGraph(&floatVec1[0], floatVec1.size(), NULL,
-25,25, 400,180, "X Angle (blue is truth, green is POSIT)" );
drawFloatGraph(&floatVec2[0], floatVec2.size(), graphImg, -25,25, 400,180);
cvSaveImage("my_graph.jpg", graphImg);
cvReleaseImage(&graphImg);


### Overlaying graphs onto an existing IplImage

You can also plot the graphs onto existing images, as shown here:

IplImage *bgImg = cvLoadImage("my_background_photo.jpg");
int w = bgImg->width;
int h = bgImg->height;
drawFloatGraph(floatArray, numFloats, bgImg, -25,25, w, h, "Yaw (in degrees)");
showImage(bgImg, 0, "Rotation Angle");
cvReleaseImage(&bgImg);


Here is a more complex example, where 3 graphs are drawn over a larger photo:

IplImage *dstImage = cvLoadImage("my_background_photo.jpg");
int W = 400, H = 200;
float RANGE = 25.0f;
char *name;

name = "X Angle (blue is truth, green is POSIT)";
setGraphColor(0);	// Start with a blue graph
// Set the position of the graph within the image
CvRect region = cvRect(dstImage->width-1 - W-10, 10, W+20, H+20);
cvSetImageROI(dstImage, region);
drawFloatGraph(&vecX1[0], vecX1.size(), dstImage, -RANGE,+RANGE, W,H, name);
drawFloatGraph(&vecX2[0], vecX2.size(), dstImage, -RANGE,+RANGE, W,H);

name = "Y Angle (blue is truth, green is POSIT)";
setGraphColor(0);	// Start with a blue graph
// Set the position of the graph within the image
region.y += H+20;
cvSetImageROI(dstImage, region);
drawFloatGraph(&vecY1[0], vecY1.size(), dstImage, -RANGE,+RANGE, W,H, name);
drawFloatGraph(&vecY2[0], vecY2.size(), dstImage, -RANGE,+RANGE, W,H);

name = "Z Angle (blue is truth, green is POSIT)";
setGraphColor(0);	// Start with a blue graph
// Set the position of the graph within the image
region.y += H+20;
cvSetImageROI(dstImage, region);
drawFloatGraph(&vecZ1[0], vecZ1.size(), dstImage, -RANGE,+RANGE, W,H, name);
drawFloatGraph(&vecZ2[0], vecZ2.size(), dstImage, -RANGE,+RANGE, W,H);

cvResetImageROI(dstImage);

showImage(dstImage);
cvReleaseImage(&dstImage);



//------------------------------------------------------------------------------
// "ImageUtils.cpp"
//------------------------------------------------------------------------------

#define USE_HIGHGUI // Enable this to display graph windows using OpenCV's HighGUI. (Supports Windows, Linux & Mac, but not iPhone).

#include <stdio.h>
#include <iostream>
//#include <tchar.h>

// OpenCV
#include <cv.h>
#include <cxcore.h>
#ifdef USE_HIGHGUI
#include <highgui.h>
#endif

#ifndef UCHAR
typedef unsigned char UCHAR;
#endif

#include "GraphUtils.h"

//------------------------------------------------------------------------------
// Graphing functions
//------------------------------------------------------------------------------
const CvScalar BLACK = CV_RGB(0,0,0);
const CvScalar WHITE = CV_RGB(255,255,255);
const CvScalar GREY = CV_RGB(150,150,150);

int countGraph = 0; // Used by 'getGraphColor()'
CvScalar customGraphColor;
int usingCustomGraphColor = 0;

// Get a new color to draw graphs. Will use the latest custom color, or change between blue, green, red, dark-blue, dark-green and dark-red until a new image is created.
CvScalar getGraphColor(void)
{
if (usingCustomGraphColor) {
usingCustomGraphColor = 0;
return customGraphColor;
}

countGraph++;
switch (countGraph) {
case 1: return CV_RGB(60,60,255); // light-blue
case 2: return CV_RGB(60,255,60); // light-green
case 3: return CV_RGB(255,60,40); // light-red
case 4: return CV_RGB(0,210,210); // blue-green
case 5: return CV_RGB(180,210,0); // red-green
case 6: return CV_RGB(210,0,180); // red-blue
case 7: return CV_RGB(0,0,185); // dark-blue
case 8: return CV_RGB(0,185,0); // dark-green
case 9: return CV_RGB(185,0,0); // dark-red
default:
countGraph = 0; // start rotating through colors again.
return CV_RGB(200,200,200); // grey
}
}
// Call 'setGraphColor()' to reset the colors that will be used for graphs.
void setGraphColor(int index)
{
countGraph = index;
usingCustomGraphColor = 0; // don't use a custom color.
}
// Specify the exact color that the next graph should be drawn as.
void setCustomGraphColor(int R, int B, int G)
{
customGraphColor = CV_RGB(R, G, B);
usingCustomGraphColor = 1; // show that it will be used.
}

// Draw the graph of an array of floats into imageDst or a new image, between minV & maxV if given.
// Remember to free the newly created image if imageDst is not given.
IplImage* drawFloatGraph(const float *arraySrc, int nArrayLength, IplImage *imageDst, float minV, float maxV, int width, int height, char *graphLabel, bool showScale)
{
int w = width;
int h = height;
int b = 10; // border around graph within the image
if (w <= 20)
w = nArrayLength + b*2; // width of the image
if (h <= 20)
h = 220;

int s = h - b*2;// size of graph height
float xscale = 1.0;
if (nArrayLength > 1)
xscale = (w - b*2) / (float)(nArrayLength-1); // horizontal scale
IplImage *imageGraph; // output image

// Get the desired image to draw into.
if (!imageDst) {
// Create an RGB image for graphing the data
imageGraph = cvCreateImage(cvSize(w,h), 8, 3);

// Clear the image
cvSet(imageGraph, WHITE);
}
else {
// Draw onto the given image.
imageGraph = imageDst;
}
if (!imageGraph) {
std::cerr << "ERROR in drawFloatGraph(): Couldn't create image of " << w << " x " << h << std::endl;
exit(1);
}
CvScalar colorGraph = getGraphColor(); // use a different color each time.

// If the user didn't supply min & max values, find them from the data, so we can draw it at full scale.
if (fabs(minV) < 0.0000001f && fabs(maxV) < 0.0000001f) {
for (int i=0; i<nArrayLength; i++) {
float v = (float)arraySrc[i];
if (v < minV)
minV = v;
if (v > maxV)
maxV = v;
}
}
float diffV = maxV - minV;
if (diffV == 0)
diffV = 0.00000001f; // Stop a divide-by-zero error
float fscale = (float)s / diffV;

// Draw the horizontal & vertical axis
int y0 = cvRound(minV*fscale);
cvLine(imageGraph, cvPoint(b,h-(b-y0)), cvPoint(w-b, h-(b-y0)), BLACK);
cvLine(imageGraph, cvPoint(b,h-(b)), cvPoint(b, h-(b+s)), BLACK);

// Write the scale of the y axis
CvFont font;
cvInitFont(&font,CV_FONT_HERSHEY_PLAIN,0.55,0.7, 0,1,CV_AA); // For OpenCV 1.1
if (showScale) {
//cvInitFont(&font,CV_FONT_HERSHEY_PLAIN,0.5,0.6, 0,1, CV_AA); // For OpenCV 2.0
CvScalar clr = GREY;
char text[16];
sprintf_s(text, sizeof(text)-1, "%.1f", maxV);
cvPutText(imageGraph, text, cvPoint(1, b+4), &font, clr);
// Write the scale of the x axis
sprintf_s(text, sizeof(text)-1, "%d", (nArrayLength-1) );
cvPutText(imageGraph, text, cvPoint(w-b+4-5*strlen(text), (h/2)+10), &font, clr);
}

// Draw the values
CvPoint ptPrev = cvPoint(b,h-(b-y0)); // Start the lines at the 1st point.
for (int i=0; i<nArrayLength; i++) {
int y = cvRound((arraySrc[i] - minV) * fscale); // Get the values at a bigger scale
int x = cvRound(i * xscale);
CvPoint ptNew = cvPoint(b+x, h-(b+y));
cvLine(imageGraph, ptPrev, ptNew, colorGraph, 1, CV_AA); // Draw a line from the previous point to the new point
ptPrev = ptNew;
}

// Write the graph label, if desired
if (graphLabel != NULL && strlen(graphLabel) > 0) {
//cvInitFont(&font,CV_FONT_HERSHEY_PLAIN, 0.5,0.7, 0,1,CV_AA);
cvPutText(imageGraph, graphLabel, cvPoint(30, 10), &font, CV_RGB(0,0,0)); // black text
}

return imageGraph;
}

// Draw the graph of an array of ints into imageDst or a new image, between minV & maxV if given.
// Remember to free the newly created image if imageDst is not given.
IplImage* drawIntGraph(const int *arraySrc, int nArrayLength, IplImage *imageDst, int minV, int maxV, int width, int height, char *graphLabel, bool showScale)
{
int w = width;
int h = height;
int b = 10; // border around graph within the image
if (w <= 20)
w = nArrayLength + b*2; // width of the image
if (h <= 20)
h = 220;

int s = h - b*2;// size of graph height
float xscale = 1.0;
if (nArrayLength > 1)
xscale = (w - b*2) / (float)(nArrayLength-1); // horizontal scale
IplImage *imageGraph; // output image

// Get the desired image to draw into.
if (!imageDst) {
// Create an RGB image for graphing the data
imageGraph = cvCreateImage(cvSize(w,h), 8, 3);

// Clear the image
cvSet(imageGraph, WHITE);
}
else {
// Draw onto the given image.
imageGraph = imageDst;
}
if (!imageGraph) {
std::cerr << "ERROR in drawIntGraph(): Couldn't create image of " << w << " x " << h << std::endl;
exit(1);
}
CvScalar colorGraph = getGraphColor(); // use a different color each time.

// If the user didn't supply min & max values, find them from the data, so we can draw it at full scale.
if (minV == 0 && maxV == 0) {
for (int i=0; i<nArrayLength; i++) {
int v = arraySrc[i];
if (v < minV)
minV = v;
if (v > maxV)
maxV = v;
}
}
int diffV = maxV - minV;
if (diffV == 0)
diffV = 1; // Stop a divide-by-zero error
float fscale = (float)s / (float)diffV;

// Draw the horizontal & vertical axis
int y0 = cvRound(minV*fscale);
cvLine(imageGraph, cvPoint(b,h-(b-y0)), cvPoint(w-b, h-(b-y0)), BLACK);
cvLine(imageGraph, cvPoint(b,h-(b)), cvPoint(b, h-(b+s)), BLACK);

// Write the scale of the y axis
CvFont font;
cvInitFont(&font,CV_FONT_HERSHEY_PLAIN,0.55,0.7, 0,1,CV_AA); // For OpenCV 1.1
if (showScale) {
//cvInitFont(&font,CV_FONT_HERSHEY_PLAIN,0.5,0.6, 0,1, CV_AA); // For OpenCV 2.0
CvScalar clr = GREY;
char text[16];
sprintf_s(text, sizeof(text)-1, "%d", maxV); // maxV is an int in this function, so use %d
cvPutText(imageGraph, text, cvPoint(1, b+4), &font, clr);
// Write the scale of the x axis
sprintf_s(text, sizeof(text)-1, "%d", (nArrayLength-1) );
cvPutText(imageGraph, text, cvPoint(w-b+4-5*strlen(text), (h/2)+10), &font, clr);
}

// Draw the values
CvPoint ptPrev = cvPoint(b,h-(b-y0)); // Start the lines at the 1st point.
for (int i=0; i<nArrayLength; i++) {
int y = cvRound((arraySrc[i] - minV) * fscale); // Get the values at a bigger scale
int x = cvRound(i * xscale);
CvPoint ptNew = cvPoint(b+x, h-(b+y));
cvLine(imageGraph, ptPrev, ptNew, colorGraph, 1, CV_AA); // Draw a line from the previous point to the new point
ptPrev = ptNew;
}

// Write the graph label, if desired
if (graphLabel != NULL && strlen(graphLabel) > 0) {
//cvInitFont(&font,CV_FONT_HERSHEY_PLAIN, 0.5,0.7, 0,1,CV_AA);
cvPutText(imageGraph, graphLabel, cvPoint(30, 10), &font, CV_RGB(0,0,0)); // black text
}

return imageGraph;
}

// Draw the graph of an array of uchars into imageDst or a new image, between minV & maxV if given..
// Remember to free the newly created image if imageDst is not given.
IplImage* drawUCharGraph(const uchar *arraySrc, int nArrayLength, IplImage *imageDst, int minV, int maxV, int width, int height, char *graphLabel, bool showScale)
{
int w = width;
int h = height;
int b = 10; // border around graph within the image
if (w <= 20)
w = nArrayLength + b*2; // width of the image
if (h <= 20)
h = 220;

int s = h - b*2;// size of graph height
float xscale = 1.0;
if (nArrayLength > 1)
xscale = (w - b*2) / (float)(nArrayLength-1); // horizontal scale
IplImage *imageGraph; // output image

// Get the desired image to draw into.
if (!imageDst) {
// Create an RGB image for graphing the data
imageGraph = cvCreateImage(cvSize(w,h), 8, 3);

// Clear the image
cvSet(imageGraph, WHITE);
}
else {
// Draw onto the given image.
imageGraph = imageDst;
}
if (!imageGraph) {
std::cerr << "ERROR in drawUCharGraph(): Couldn't create image of " << w << " x " << h << std::endl;
exit(1);
}
CvScalar colorGraph = getGraphColor(); // use a different color each time.

// If the user didn't supply min & max values, find them from the data, so we can draw it at full scale.
if (minV == 0 && maxV == 0) {
for (int i=0; i<nArrayLength; i++) {
int v = arraySrc[i];
if (v < minV)
minV = v;
if (v > maxV)
maxV = v;
}
}
int diffV = maxV - minV;
if (diffV == 0)
diffV = 1; // Stop a divide-by-zero error
float fscale = (float)s / (float)diffV;

// Draw the horizontal & vertical axis
int y0 = cvRound(minV*fscale);
cvLine(imageGraph, cvPoint(b,h-(b-y0)), cvPoint(w-b, h-(b-y0)), BLACK);
cvLine(imageGraph, cvPoint(b,h-(b)), cvPoint(b, h-(b+s)), BLACK);

// Write the scale of the y axis
CvFont font;
cvInitFont(&font,CV_FONT_HERSHEY_PLAIN,0.55,0.7, 0,1,CV_AA); // For OpenCV 1.1
if (showScale) {
//cvInitFont(&font,CV_FONT_HERSHEY_PLAIN,0.5,0.6, 0,1, CV_AA); // For OpenCV 2.0
CvScalar clr = GREY;
char text[16];
sprintf_s(text, sizeof(text)-1, "%d", maxV); // maxV is an int in this function, so use %d
cvPutText(imageGraph, text, cvPoint(1, b+4), &font, clr);
// Write the scale of the x axis
sprintf_s(text, sizeof(text)-1, "%d", (nArrayLength-1) );
cvPutText(imageGraph, text, cvPoint(w-b+4-5*strlen(text), (h/2)+10), &font, clr);
}

// Draw the values
CvPoint ptPrev = cvPoint(b,h-(b-y0)); // Start the lines at the 1st point.
for (int i=0; i<nArrayLength; i++) {
int y = cvRound((arraySrc[i] - minV) * fscale); // Get the values at a bigger scale
int x = cvRound(i * xscale);
CvPoint ptNew = cvPoint(b+x, h-(b+y));
cvLine(imageGraph, ptPrev, ptNew, colorGraph, 1, CV_AA); // Draw a line from the previous point to the new point
ptPrev = ptNew;
}

// Write the graph label, if desired
if (graphLabel != NULL && strlen(graphLabel) > 0) {
//cvInitFont(&font,CV_FONT_HERSHEY_PLAIN, 0.5,0.7, 0,1,CV_AA);
cvPutText(imageGraph, graphLabel, cvPoint(30, 10), &font, CV_RGB(0,0,0)); // black text
}

return imageGraph;
}

// Display a graph of the given float array.
// If background is provided, it will be drawn into, for combining multiple graphs using drawFloatGraph().
// Set delay_ms to 0 if you want to wait forever until a keypress, or set it to 1 if you want it to delay just 1 millisecond.
void showFloatGraph(const char *name, const float *arraySrc, int nArrayLength, int delay_ms, IplImage *background)
{
#ifdef USE_HIGHGUI
// Draw the graph
IplImage *imageGraph = drawFloatGraph(arraySrc, nArrayLength, background);

// Display the graph into a window
cvNamedWindow( name );
cvShowImage( name, imageGraph );

cvWaitKey( 10 ); // Note that cvWaitKey() is required for the OpenCV window to show!
cvWaitKey( delay_ms ); // Wait longer to make sure the user has seen the graph

cvReleaseImage(&imageGraph);
#endif
}

// Display a graph of the given int array.
// If background is provided, it will be drawn into, for combining multiple graphs using drawIntGraph().
// Set delay_ms to 0 if you want to wait forever until a keypress, or set it to 1 if you want it to delay just 1 millisecond.
void showIntGraph(const char *name, const int *arraySrc, int nArrayLength, int delay_ms, IplImage *background)
{
#ifdef USE_HIGHGUI
// Draw the graph
IplImage *imageGraph = drawIntGraph(arraySrc, nArrayLength, background);

// Display the graph into a window
cvNamedWindow( name );
cvShowImage( name, imageGraph );

cvWaitKey( 10 ); // Note that cvWaitKey() is required for the OpenCV window to show!
cvWaitKey( delay_ms ); // Wait longer to make sure the user has seen the graph

cvReleaseImage(&imageGraph);
#endif
}

// Display a graph of the given unsigned char array.
// If background is provided, it will be drawn into, for combining multiple graphs using drawUCharGraph().
// Set delay_ms to 0 if you want to wait forever until a keypress, or set it to 1 if you want it to delay just 1 millisecond.
void showUCharGraph(const char *name, const uchar *arraySrc, int nArrayLength, int delay_ms, IplImage *background)
{
#ifdef USE_HIGHGUI
// Draw the graph
IplImage *imageGraph = drawUCharGraph(arraySrc, nArrayLength, background);

// Display the graph into a window
cvNamedWindow( name );
cvShowImage( name, imageGraph );

cvWaitKey( 10 ); // Note that cvWaitKey() is required for the OpenCV window to show!
cvWaitKey( delay_ms ); // Wait longer to make sure the user has seen the graph

cvReleaseImage(&imageGraph);
#endif
}

// Simple helper function to easily view an image, with an optional pause.
void showImage(const IplImage *img, int delay_ms, char *name)
{
#ifdef USE_HIGHGUI
if (!name)
name = "Image";
cvNamedWindow(name, CV_WINDOW_AUTOSIZE);
cvShowImage(name, img);
cvWaitKey(delay_ms);
#endif
}


Header File

//------------------------------------------------------------------------------
// Graphing functions for OpenCV. Part of "ImageUtils.h"
//------------------------------------------------------------------------------

#ifndef GRAPH_UTILS_H
#define GRAPH_UTILS_H

#ifdef __cplusplus
extern "C"
{
#endif

// Allow 'bool' variables in both C and C++ code.
#ifdef __cplusplus
#else
typedef int bool;
#define true (1)
#define false (0)
#endif

#ifdef __cplusplus
#define DEFAULT(val) = val
#else
#define DEFAULT(val)
#endif

//------------------------------------------------------------------------------
// Graphing functions
//------------------------------------------------------------------------------

// Draw the graph of an array of floats into imageDst or a new image, between minV & maxV if given.
// Remember to free the newly created image if imageDst is not given.
IplImage* drawFloatGraph(const float *arraySrc, int nArrayLength, IplImage *imageDst DEFAULT(0), float minV DEFAULT(0.0), float maxV DEFAULT(0.0), int width DEFAULT(0), int height DEFAULT(0), char *graphLabel DEFAULT(0), bool showScale DEFAULT(true));

// Draw the graph of an array of ints into imageDst or a new image, between minV & maxV if given.
// Remember to free the newly created image if imageDst is not given.
IplImage* drawIntGraph(const int *arraySrc, int nArrayLength, IplImage *imageDst DEFAULT(0), int minV DEFAULT(0), int maxV DEFAULT(0), int width DEFAULT(0), int height DEFAULT(0), char *graphLabel DEFAULT(0), bool showScale DEFAULT(true));

// Draw the graph of an array of uchars into imageDst or a new image, between minV & maxV if given.
// Remember to free the newly created image if imageDst is not given.
IplImage* drawUCharGraph(const uchar *arraySrc, int nArrayLength, IplImage *imageDst DEFAULT(0), int minV DEFAULT(0), int maxV DEFAULT(0), int width DEFAULT(0), int height DEFAULT(0), char *graphLabel DEFAULT(0), bool showScale DEFAULT(true));

// Display a graph of the given float array.
// If background is provided, it will be drawn into, for combining multiple graphs using drawFloatGraph().
// Set delay_ms to 0 if you want to wait forever until a keypress, or set it to 1 if you want it to delay just 1 millisecond.
void showFloatGraph(const char *name, const float *arraySrc, int nArrayLength, int delay_ms DEFAULT(500), IplImage *background DEFAULT(0));

// Display a graph of the given int array.
// If background is provided, it will be drawn into, for combining multiple graphs using drawIntGraph().
// Set delay_ms to 0 if you want to wait forever until a keypress, or set it to 1 if you want it to delay just 1 millisecond.
void showIntGraph(const char *name, const int *arraySrc, int nArrayLength, int delay_ms DEFAULT(500), IplImage *background DEFAULT(0));

// Display a graph of the given unsigned char array.
// If background is provided, it will be drawn into, for combining multiple graphs using drawUCharGraph().
// Set delay_ms to 0 if you want to wait forever until a keypress, or set it to 1 if you want it to delay just 1 millisecond.
void showUCharGraph(const char *name, const uchar *arraySrc, int nArrayLength, int delay_ms DEFAULT(500), IplImage *background DEFAULT(0));

// Simple helper function to easily view an image, with an optional pause.
void showImage(const IplImage *img, int delay_ms DEFAULT(0), char *name DEFAULT(0));

// Call 'setGraphColor(0)' to reset the colors that will be used for graphs.
void setGraphColor(int index DEFAULT(0));
// Specify the exact color that the next graph should be drawn as.
void setCustomGraphColor(int R, int B, int G);

#if defined (__cplusplus)
}
#endif

#endif // end GRAPH_UTILS_H

Posted in OpenCV

## Real-Time Object Tracking

Posted by Hemprasad Y. Badgujar on January 19, 2015

## Real-Time Object Tracking Using OpenCV

//objectTrackingTutorial.cpp

//Written by  Kyle Hounslow 2013

//Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the "Software")
//, to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense,
//and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions:

//The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.

//THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
//FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
//LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS
//IN THE SOFTWARE.

#include <sstream>
#include <string>
#include <iostream>
#include <opencv\highgui.h>
#include <opencv\cv.h>

using namespace cv;
//initial min and max HSV filter values.
//these will be changed using trackbars
int H_MIN = 0;
int H_MAX = 256;
int S_MIN = 0;
int S_MAX = 256;
int V_MIN = 0;
int V_MAX = 256;
//default capture width and height
const int FRAME_WIDTH = 640;
const int FRAME_HEIGHT = 480;
//max number of objects to be detected in frame
const int MAX_NUM_OBJECTS=50;
//minimum and maximum object area
const int MIN_OBJECT_AREA = 20*20;
const int MAX_OBJECT_AREA = FRAME_HEIGHT*FRAME_WIDTH/1.5;
//names that will appear at the top of each window
const string windowName = "Original Image";
const string windowName1 = "HSV Image";
const string windowName2 = "Thresholded Image";
const string windowName3 = "After Morphological Operations";
const string trackbarWindowName = "Trackbars";
void on_trackbar( int, void* )
{//This function gets called whenever a
// trackbar position is changed

}
string intToString(int number){

std::stringstream ss;
ss << number;
return ss.str();
}
void createTrackbars(){
//create window for trackbars

namedWindow(trackbarWindowName,0);
//create memory to store trackbar name on window
char TrackbarName[50];
sprintf( TrackbarName, "H_MIN %d", H_MIN);
sprintf( TrackbarName, "H_MAX %d", H_MAX);
sprintf( TrackbarName, "S_MIN %d", S_MIN);
sprintf( TrackbarName, "S_MAX %d", S_MAX);
sprintf( TrackbarName, "V_MIN %d", V_MIN);
sprintf( TrackbarName, "V_MAX %d", V_MAX);
//create trackbars and insert them into window
//3 parameters are: the address of the variable that is changing when the trackbar is moved(eg.H_LOW),
//the max value the trackbar can move (eg. H_HIGH),
//and the function that is called whenever the trackbar is moved(eg. on_trackbar)
//                                  ---->    ---->     ---->
createTrackbar( "H_MIN", trackbarWindowName, &H_MIN, H_MAX, on_trackbar );
createTrackbar( "H_MAX", trackbarWindowName, &H_MAX, H_MAX, on_trackbar );
createTrackbar( "S_MIN", trackbarWindowName, &S_MIN, S_MAX, on_trackbar );
createTrackbar( "S_MAX", trackbarWindowName, &S_MAX, S_MAX, on_trackbar );
createTrackbar( "V_MIN", trackbarWindowName, &V_MIN, V_MAX, on_trackbar );
createTrackbar( "V_MAX", trackbarWindowName, &V_MAX, V_MAX, on_trackbar );

}
void drawObject(int x, int y,Mat &frame){

//use some of the openCV drawing functions to draw crosshairs
//on your tracked image!

//UPDATE:JUNE 18TH, 2013
//added 'if' and 'else' statements to prevent
//memory errors from writing off the screen (ie. (-25,-25) is not within the window!)

circle(frame,Point(x,y),20,Scalar(0,255,0),2);
if(y-25>0)
line(frame,Point(x,y),Point(x,y-25),Scalar(0,255,0),2);
else line(frame,Point(x,y),Point(x,0),Scalar(0,255,0),2);
if(y+25<FRAME_HEIGHT)
line(frame,Point(x,y),Point(x,y+25),Scalar(0,255,0),2);
else line(frame,Point(x,y),Point(x,FRAME_HEIGHT),Scalar(0,255,0),2);
if(x-25>0)
line(frame,Point(x,y),Point(x-25,y),Scalar(0,255,0),2);
else line(frame,Point(x,y),Point(0,y),Scalar(0,255,0),2);
if(x+25<FRAME_WIDTH)
line(frame,Point(x,y),Point(x+25,y),Scalar(0,255,0),2);
else line(frame,Point(x,y),Point(FRAME_WIDTH,y),Scalar(0,255,0),2);

putText(frame,intToString(x)+","+intToString(y),Point(x,y+30),1,1,Scalar(0,255,0),2);

}
void morphOps(Mat &thresh){

//create structuring element that will be used to "dilate" and "erode" image.
//the element chosen here is a 3px by 3px rectangle
Mat erodeElement = getStructuringElement( MORPH_RECT,Size(3,3));
//dilate with larger element so make sure object is nicely visible
Mat dilateElement = getStructuringElement( MORPH_RECT,Size(8,8));

erode(thresh,thresh,erodeElement);
dilate(thresh,thresh,dilateElement);

}
void trackFilteredObject(int &x, int &y, Mat threshold, Mat &cameraFeed){

Mat temp;
threshold.copyTo(temp);
//these two vectors needed for output of findContours
vector< vector<Point> > contours;
vector<Vec4i> hierarchy;
//find contours of filtered image using openCV findContours function
findContours(temp,contours,hierarchy,CV_RETR_CCOMP,CV_CHAIN_APPROX_SIMPLE );
//use moments method to find our filtered object
double refArea = 0;
bool objectFound = false;
if (hierarchy.size() > 0) {
int numObjects = hierarchy.size();
//if number of objects greater than MAX_NUM_OBJECTS we have a noisy filter
if(numObjects<MAX_NUM_OBJECTS){
for (int index = 0; index >= 0; index = hierarchy[index][0]) {

Moments moment = moments((cv::Mat)contours[index]);
double area = moment.m00;

//if the area is less than 20 px by 20px then it is probably just noise
//if the area is the same as the 3/2 of the image size, probably just a bad filter
//we only want the object with the largest area so we save a reference area each
//iteration and compare it to the area in the next iteration.
if(area>MIN_OBJECT_AREA && area<MAX_OBJECT_AREA && area>refArea){
x = moment.m10/area;
y = moment.m01/area;
objectFound = true;
refArea = area;
}else objectFound = false;

}
//let user know you found an object
if(objectFound ==true){
putText(cameraFeed,"Tracking Object",Point(0,50),2,1,Scalar(0,255,0),2);
//draw object location on screen
drawObject(x,y,cameraFeed);}

}else putText(cameraFeed,"TOO MUCH NOISE! ADJUST FILTER",Point(0,50),1,2,Scalar(0,0,255),2);
}
}
int main(int argc, char* argv[])
{
//some boolean variables for different functionality within this
//program
bool trackObjects = false;
bool useMorphOps = false;
//Matrix to store each frame of the webcam feed
Mat cameraFeed;
//matrix storage for HSV image
Mat HSV;
//matrix storage for binary threshold image
Mat threshold;
//x and y values for the location of the object
int x=0, y=0;
//create slider bars for HSV filtering
createTrackbars();
//video capture object to acquire webcam feed
VideoCapture capture;
//open capture object at location zero (default location for webcam)
capture.open(0);
//set height and width of capture frame
capture.set(CV_CAP_PROP_FRAME_WIDTH,FRAME_WIDTH);
capture.set(CV_CAP_PROP_FRAME_HEIGHT,FRAME_HEIGHT);
//start an infinite loop where webcam feed is copied to cameraFeed matrix
//all of our operations will be performed within this loop
while(1){
//store image to matrix
capture.read(cameraFeed);
//convert frame from BGR to HSV colorspace
cvtColor(cameraFeed,HSV,COLOR_BGR2HSV);
//filter HSV image between values and store filtered image to
//threshold matrix
inRange(HSV,Scalar(H_MIN,S_MIN,V_MIN),Scalar(H_MAX,S_MAX,V_MAX),threshold);
//perform morphological operations on thresholded image to eliminate noise
//and emphasize the filtered object(s)
if(useMorphOps)
morphOps(threshold);
//pass in thresholded frame to our object tracking function
//this function will return the x and y coordinates of the
//filtered object
if(trackObjects)
trackFilteredObject(x,y,threshold,cameraFeed);

//show frames
imshow(windowName2,threshold);
imshow(windowName,cameraFeed);
imshow(windowName1,HSV);

//delay 30ms so that screen can refresh.
//image will not appear without this waitKey() command
waitKey(30);
}

return 0;
}
</pre>
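The tracking function above recovers the object centre from image moments: x = m10/m00 and y = m01/m00. For a binary mask those moments are just pixel sums, which can be sketched without OpenCV (an illustrative helper, not the tutorial's code; `cv::moments` computes the same quantities):

```cpp
#include <cstddef>
#include <vector>

// Centre of mass of a binary mask via raw image moments: m00 is the area,
// m10 and m01 are the sums of x and y over the "on" pixels, so the centroid
// is (m10/m00, m01/m00) -- the same values the tutorial reads from moments().
struct Centroid { double x, y; };

Centroid centroidOfMask(const std::vector<std::vector<int>> &mask)
{
    double m00 = 0, m10 = 0, m01 = 0;
    for (std::size_t y = 0; y < mask.size(); ++y)
        for (std::size_t x = 0; x < mask[y].size(); ++x)
            if (mask[y][x]) {
                m00 += 1;
                m10 += static_cast<double>(x);
                m01 += static_cast<double>(y);
            }
    return {m10 / m00, m01 / m00};
}
```

This is also why the tutorial compares `moment.m00` against `MIN_OBJECT_AREA`: for a binary image, m00 is simply the number of set pixels.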
<pre>


Posted in Computer Vision, OpenCV, OpenCV Tutorial | Leave a Comment »

## Writing Video to a File

Posted by Hemprasad Y. Badgujar on January 19, 2015

</pre>
<pre>#include <opencv\highgui.h>
#include <opencv\cv.h>
#include <iostream>
#include <sstream>
#include <string>

using namespace cv;
using namespace std;

string intToString(int number){

std::stringstream ss;
ss << number;
return ss.str();
}

int main(int argc, char* argv[])
{

VideoCapture cap(0); // open the video camera no. 0

if (!cap.isOpened())  // if not success, exit program
{
cout << "ERROR INITIALIZING VIDEO CAPTURE" << endl;
return -1;
}

const char* windowName = "Webcam Feed";
namedWindow(windowName,CV_WINDOW_AUTOSIZE); //create a window to display our webcam feed

while (1) {

Mat frame;

bool bSuccess = cap.read(frame); // read a new frame from camera feed

if (!bSuccess) //test if frame successfully read
{
cout << "ERROR READING FRAME FROM CAMERA FEED" << endl;
break;
}

imshow(windowName, frame); //show the frame in "MyVideo" window

//listen for 10ms for a key to be pressed
switch(waitKey(10)){

case 27:
//'esc' has been pressed (ASCII value for 'esc' is 27)
//exit program.
return 0;

}

}

return 0;

}


</pre>
<pre>#include <opencv\highgui.h>
#include <opencv\cv.h>
#include <iostream>
#include <sstream>
#include <string>

using namespace cv;
using namespace std;

string intToString(int number){

std::stringstream ss;
ss << number;
return ss.str();
}

int main(int argc, char* argv[])
{
bool recording = false;
bool startNewRecording = false;
int inc=0;
bool firstRun = true;

VideoCapture cap(0); // open the video camera no. 0
VideoWriter oVideoWriter;//create videoWriter object, not initialized yet

if (!cap.isOpened())  // if not success, exit program
{
cout << "ERROR: Cannot open the video file" << endl;
return -1;
}

namedWindow("MyVideo",CV_WINDOW_AUTOSIZE); //create a window called "MyVideo"

double dWidth = cap.get(CV_CAP_PROP_FRAME_WIDTH); //get the width of frames of the video
double dHeight = cap.get(CV_CAP_PROP_FRAME_HEIGHT); //get the height of frames of the video

cout << "Frame Size = " << dWidth << "x" << dHeight << endl;

//set framesize for use with videoWriter
Size frameSize(static_cast<int>(dWidth), static_cast<int>(dHeight));

while (1) {

if(startNewRecording==true){

oVideoWriter  = VideoWriter("D:/MyVideo"+intToString(inc)+".avi", CV_FOURCC('D', 'I', 'V', '3'), 20, frameSize, true); //initialize the VideoWriter object
//oVideoWriter  = VideoWriter("D:/MyVideo"+intToString(inc)+".avi", (int)cap.get(CV_CAP_PROP_FOURCC), 20, frameSize, true); //initialize the VideoWriter object

recording = true;
startNewRecording = false;
cout << "New video file created D:/MyVideo" + intToString(inc) + ".avi" << endl;
</pre>
<pre>

## People Detection

Posted by Hemprasad Y. Badgujar on January 19, 2015

# People Detection Sample from OpenCV

</pre>
<pre class="cpp">#include <opencv2/opencv.hpp>

using namespace cv;

int main (int argc, const char * argv[])
{
VideoCapture cap(0);
cap.set(CV_CAP_PROP_FRAME_WIDTH, 320);
cap.set(CV_CAP_PROP_FRAME_HEIGHT, 240);

if (!cap.isOpened())
return -1;

Mat img;
namedWindow("opencv", CV_WINDOW_AUTOSIZE);
HOGDescriptor hog;
hog.setSVMDetector(HOGDescriptor::getDefaultPeopleDetector());

while (true)
{
cap >> img;
if (img.empty())
continue;

vector<Rect> found, found_filtered;
hog.detectMultiScale(img, found, 0, Size(8,8), Size(32,32), 1.05, 2);
size_t i, j;
for (i=0; i<found.size(); i++)
{
Rect r = found[i];
for (j=0; j<found.size(); j++)
if (j!=i && (r & found[j]) == r)
break;
if (j==found.size())
found_filtered.push_back(r);
}

for (i=0; i<found_filtered.size(); i++)
{
Rect r = found_filtered[i];
r.x += cvRound(r.width*0.1);
r.width = cvRound(r.width*0.8);
r.y += cvRound(r.height*0.07);
r.height = cvRound(r.height*0.8);
rectangle(img, r.tl(), r.br(), Scalar(0,255,0), 3);
}

imshow("opencv", img);
if (waitKey(10)>=0)
break;
}
return 0;
}</pre>
<pre>

Posted in Computer Vision, OpenCV, OpenCV Tutorial | Leave a Comment »

## Computer Vision source codes

Posted by Hemprasad Y. Badgujar on January 19, 2015

Feature Detection and Description

General Libraries:

• VLFeat – Implementation of various feature descriptors (including SIFT, HOG, and LBP) and covariant feature detectors (including DoG, Hessian, Harris Laplace, Hessian Laplace, Multiscale Hessian, Multiscale Harris). Easy-to-use Matlab interface. See Modern features: Software – Slides providing a demonstration of VLFeat and also links to other software. Check also VLFeat hands-on session training
• OpenCV – Various implementations of modern feature detectors and descriptors (SIFT, SURF, FAST, BRIEF, ORB, FREAK, etc.)

Fast Keypoint Detectors for Real-time Applications:

• FAST – High-speed corner detector implementation for a wide variety of platforms
• AGAST – Even faster than the FAST corner detector. A multi-scale version of this method is used for the BRISK descriptor (ECCV 2010).

Binary Descriptors for Real-Time Applications:

• BRIEF – C++ code for a fast and accurate interest point descriptor (not invariant to rotations and scale) (ECCV 2010)
• ORB – OpenCV implementation of the Oriented-Brief (ORB) descriptor (invariant to rotations, but not scale)
• BRISK – Efficient Binary descriptor invariant to rotations and scale. It includes a Matlab mex interface. (ICCV 2011)
• FREAK – Faster than BRISK (invariant to rotations and scale) (CVPR 2012)

SIFT and SURF Implementations:

Other Local Feature Detectors and Descriptors:

• VGG Affine Covariant features – Oxford code for various affine covariant feature detectors and descriptors.
• LIOP descriptor – Source code for the Local Intensity order Pattern (LIOP) descriptor (ICCV 2011).
• Local Symmetry Features – Source code for matching of local symmetry features under large variations in lighting, age, and rendering style (CVPR 2012).

Global Image Descriptors:

• GIST – Matlab code for the GIST descriptor
• CENTRIST – Global visual descriptor for scene categorization and object detection (PAMI 2011)

Feature Coding and Pooling

• VGG Feature Encoding Toolkit – Source code for various state-of-the-art feature encoding methods – including Standard hard encoding, Kernel codebook encoding, Locality-constrained linear encoding, and Fisher kernel encoding.
• Spatial Pyramid Matching – Source code for feature pooling based on spatial pyramid matching (widely used for image classification)

Convolutional Nets and Deep Learning

• Caffe – Fast C++ implementation of deep convolutional networks (GPU / CPU / ImageNet 2013 demonstration).
• OverFeat – C++ library for integrated classification and localization of objects.
• EBLearn – C++ Library for Energy-Based Learning. It includes several demos and step-by-step instructions to train classifiers based on convolutional neural networks.
• Torch7 – Provides a matlab-like environment for state-of-the-art machine learning algorithms, including a fast implementation of convolutional neural networks.
• Deep Learning – Various links for deep learning software.

Facial Feature Detection and Tracking

• IntraFace – Very accurate detection and tracking of facial features (C++/Matlab API).

Part-Based Models

Attributes and Semantic Features

Large-Scale Learning

• Additive Kernels – Source code for fast additive kernel SVM classifiers (PAMI 2013).
• LIBLINEAR – Library for large-scale linear SVM classification.
• VLFeat – Implementation for Pegasos SVM and Homogeneous Kernel map.

Fast Indexing and Image Retrieval

• FLANN – Library for performing fast approximate nearest neighbor.
• Kernelized LSH – Source code for Kernelized Locality-Sensitive Hashing (ICCV 2009).
• ITQ Binary codes – Code for generation of small binary codes using Iterative Quantization and other baselines such as Locality-Sensitive-Hashing (CVPR 2011).
• INRIA Image Retrieval – Efficient code for state-of-the-art large-scale image retrieval (CVPR 2011).

Object Detection

3D Recognition

Action Recognition

Datasets

Attributes

• Animals with Attributes – 30,475 images of 50 animals classes with 6 pre-extracted feature representations for each image.
• aYahoo and aPascal – Attribute annotations for images collected from Yahoo and Pascal VOC 2008.
• FaceTracer – 15,000 faces annotated with 10 attributes and fiducial points.
• PubFig – 58,797 face images of 200 people with 73 attribute classifier outputs.
• LFW – 13,233 face images of 5,749 people with 73 attribute classifier outputs.
• Human Attributes – 8,000 people with annotated attributes. Check also this link for another dataset of human attributes.
• SUN Attribute Database – Large-scale scene attribute database with a taxonomy of 102 attributes.
• ImageNet Attributes – Variety of attribute labels for the ImageNet dataset.
• Relative attributes – Data for OSR and a subset of PubFig datasets. Check also this link for the WhittleSearch data.
• Attribute Discovery Dataset – Images of shopping categories associated with textual descriptions.

Fine-grained Visual Categorization

Face Detection

• FDDB – UMass face detection dataset and benchmark (5,000+ faces)
• CMU/MIT – Classical face detection dataset.

Face Recognition

• Face Recognition Homepage – Large collection of face recognition datasets.
• LFW – UMass unconstrained face recognition dataset (13,000+ face images).
• NIST Face Homepage – includes face recognition grand challenge (FRGC), vendor tests (FRVT) and others.
• CMU Multi-PIE – contains more than 750,000 images of 337 people, with 15 different views and 19 lighting conditions.
• FERET – Classical face recognition dataset.
• Deng Cai’s face dataset in Matlab Format – Easy to use if you want play with simple face datasets including Yale, ORL, PIE, and Extended Yale B.
• SCFace – Low-resolution face dataset captured from surveillance cameras.

Handwritten Digits

• MNIST – large dataset containing a training set of 60,000 examples, and a test set of 10,000 examples.

Pedestrian Detection

Generic Object Recognition

• ImageNet – Currently the largest visual recognition dataset in terms of number of categories and images.
• Tiny Images – 80 million 32×32 low resolution images.
• Pascal VOC – One of the most influential visual recognition datasets.
• Caltech 101 / Caltech 256 – Popular image datasets containing 101 and 256 object categories, respectively.
• MIT LabelMe – Online annotation tool for building computer vision databases.

Scene Recognition

Feature Detection and Description

• VGG Affine Dataset – Widely used dataset for measuring performance of feature detection and description. Check VLBenchmarks for an evaluation framework.

Action Recognition

RGBD Recognition

## Face Tracking Using a Kalman Filter

Posted by Hemprasad Y. Badgujar on January 19, 2015

In a previous article, I showed how face detection can be performed in MATLAB using OpenCV. In this article, I will combine this face detector with a Kalman filter to build a simple face tracker that can track a face in a video.

If you are unfamiliar with Kalman filters, I suggest you read up first on how alpha beta filters work. They are a simplified version of the Kalman filter that are much easier to understand, but still apply many of the core ideas of the Kalman filter.

## Face tracking without a Kalman filter

The OpenCV-based face detector can be applied to every frame to detect the location of the face. Because it may detect multiple faces, we need a method to find the relationship between a detected face in one frame to another face in the next frame — this is a combinatorial problem known as data association. The simplest method is the nearest neighbour approach, and some other methods can be found in this survey paper on object tracking. However, to greatly simplify the problem, the tracker I have implemented is a single face tracker and it assumes there is always a face in the frame. This means that every face that is detected can be assumed to be the same person’s face. If more than one face is detected, only the first face is used. If no faces are detected, a detection error is assumed. The MATLAB code below will detect the face location in a sequence of images and output the bounding box coordinates to a CSV file.
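The nearest-neighbour association mentioned above can be sketched in a few lines (a hypothetical C++ helper for illustration; the article itself sidesteps data association by assuming a single face per frame):

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// A detection stored by its centre point (hypothetical helper type; the
// article's detector actually outputs corner coordinates).
struct Box { double cx, cy; };

// Nearest-neighbour data association: match the face tracked in the previous
// frame to the current detection whose centre is closest in Euclidean
// distance. Returns the index of the best match, or -1 if no detections.
int nearestNeighbour(const Box &previous, const std::vector<Box> &detections)
{
    int best = -1;
    double bestDist = 0.0;
    for (std::size_t i = 0; i < detections.size(); ++i) {
        double dx = detections[i].cx - previous.cx;
        double dy = detections[i].cy - previous.cy;
        double dist = std::sqrt(dx * dx + dy * dy);
        if (best < 0 || dist < bestDist) {
            best = static_cast<int>(i);
            bestDist = dist;
        }
    }
    return best;
}
```

This greedy matching breaks down when tracks cross or detections are missed, which is why multi-object trackers use the more sophisticated methods surveyed in the paper cited above.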

<pre>function detect_faces(imgDir, opencvPath, includePath, outputFilename)

    % Load the required libraries
    if libisloaded('highgui2410'), unloadlibrary highgui2410, end
    if libisloaded('cv2410'), unloadlibrary cv2410, end
    if libisloaded('cxcore2410'), unloadlibrary cxcore2410, end
    loadlibrary(...
        fullfile(opencvPath, 'bin\cxcore2410.dll'), @proto_cxcore);
    loadlibrary(...
        fullfile(opencvPath, 'bin\cv2410.dll'), ...
        fullfile(opencvPath, 'cv\include\cv.h'), ...
            'alias', 'cv2410', 'includepath', includePath);
    loadlibrary(...
        fullfile(opencvPath, 'bin\highgui2410.dll'), ...
        fullfile(opencvPath, 'otherlibs\highgui\highgui.h'), ...
            'alias', 'highgui2410', 'includepath', includePath);

    % Load the cascade
    classifierFilename = 'C:/Program Files/OpenCV/data/haarcascades/haarcascade_frontalface_alt.xml';
    cvCascade = calllib('cv2410', 'cvLoadHaarClassifierCascade', classifierFilename, ...
        libstruct('CvSize',struct('width',int16(2410),'height',int16(2410))));

    % Create memory storage
    cvStorage = calllib('cxcore2410', 'cvCreateMemStorage', 0);

    % Get the list of images
    imageFiles = dir(imgDir);
    detections = struct;
    h = waitbar(0, 'Performing face detection...'); % progress bar

    % Open the output CSV file
    fid = fopen(outputFilename, 'w');
    fprintf(fid, 'filename,x1,y1,x2,y2');

    for i = 1:numel(imageFiles)
        if imageFiles(i).isdir; continue; end
        imageFile = fullfile(imgDir, imageFiles(i).name);

        % Load the input image
        cvImage = calllib('highgui2410', ...
            'cvLoadImage', imageFile, int16(1));
        if ~cvImage.Value.nSize
            error('Image could not be loaded');
        end

        % Perform face detection
        cvSeq = calllib('cv2410', 'cvHaarDetectObjects', cvImage, cvCascade, cvStorage, ...
            1.1, 2, 0, libstruct('CvSize',struct('width',int16(40),'height',int16(40))));

        % Save the detected bounding box, if any (and if there's multiple
        % detections, just use the first one)
        detections(i).filename = imageFile;
        if cvSeq.Value.total == 1
            cvRect = calllib('cxcore2410', ...
                'cvGetSeqElem', cvSeq, int16(1));
            fprintf(fid, '\n%s,%d,%d,%d,%d', imageFile, ...
                cvRect.Value.x, cvRect.Value.y, ...
                cvRect.Value.x + cvRect.Value.width, ...
                cvRect.Value.y + cvRect.Value.height);
        else
            fprintf(fid, '\n%s,%d,%d,%d,%d', imageFile, 0, 0, 0, 0);
        end

        % Release image
        calllib('cxcore2410', 'cvReleaseImage', cvImage);
        waitbar(i / numel(imageFiles), h);
    end

    % Release resources
    fclose(fid);
    close(h);
    calllib('cxcore2410', 'cvReleaseMemStorage', cvStorage);
    calllib('cv2410', 'cvReleaseHaarClassifierCascade', cvCascade);
end</pre>

We can then run our face detector and generate an output file, faces.csv, like this:

<pre>imgDir = 'images';
opencvPath = 'C:\Program Files\OpenCV';
includePath = fullfile(opencvPath, 'cxcore\include');
detect_faces(imgDir, opencvPath, includePath, 'faces.csv');</pre>

In the video below, I have run this script on the FGnet Talking Face database (which is free to download) and displayed the bounding boxes overlaid on the image sequence. You can download a copy of the faces.csv file that was used to generate the video from here.

The bounding box roughly follows the face, but its trajectory is quite noisy and the video contains numerous frames where the bounding box disappears because the face was not detected. The Kalman filter can be used to smooth this trajectory and estimate the location of the bounding box when the face detector fails.

## Kalman filtering: The gritty details

The Kalman filter is a recursive two-stage filter. At each iteration, it performs a predict step and an update step.

The predict step predicts the current location of the moving object based on previous observations. For instance, if an object is moving with constant acceleration, we can predict its current location, $\hat{\textbf{x}}_{t}$, based on its previous location, $\hat{\textbf{x}}_{t-1}$, using the equations of motion.

The update step takes the measurement of the object’s current location (if available), $\textbf{z}_{t}$, and combines this with the predicted current location, $\hat{\textbf{x}}_{t}$, to obtain an a posteriori estimated current location of the object, $\textbf{x}_{t}$.

The equations that govern the Kalman filter are given below (taken from the Wikipedia article):

1. Predict stage:
1. Predicted (a priori) state: $\hat{\textbf{x}}_{t|t-1} = \textbf{F}_{t}\hat{\textbf{x}}_{t-1|t-1} + \textbf{B}_{t} \textbf{u}_{t}$
2. Predicted (a priori) estimate covariance: $\textbf{P}_{t|t-1} = \textbf{F}_{t} \textbf{P}_{t-1|t-1} \textbf{F}_{t}^{T}+ \textbf{Q}_{t}$
2. Update stage:
1. Innovation or measurement residual: $\tilde{\textbf{y}}_t = \textbf{z}_t - \textbf{H}_t\hat{\textbf{x}}_{t|t-1}$
2. Innovation (or residual) covariance: $\textbf{S}_t = \textbf{H}_t \textbf{P}_{t|t-1} \textbf{H}_t^T + \textbf{R}_t$
3. Optimal Kalman gain: $\textbf{K}_t = \textbf{P}_{t|t-1} \textbf{H}_t^T \textbf{S}_t^{-1}$
4. Updated (a posteriori) state estimate: $\hat{\textbf{x}}_{t|t} = \hat{\textbf{x}}_{t|t-1} + \textbf{K}_t\tilde{\textbf{y}}_t$
5. Updated (a posteriori) estimate covariance: $\textbf{P}_{t|t} = (I - \textbf{K}_t \textbf{H}_t) \textbf{P}_{t|t-1}$

They can be difficult to understand at first, so let’s first take a look at what each of these variables are used for:

• ${\mathbf{x}_{t}}$ is the current state vector, as estimated by the Kalman filter, at time ${t}$.
• ${\mathbf{z}_{t}}$ is the measurement vector taken at time ${t}$.
• ${\mathbf{P}_{t}}$ measures the estimated accuracy of ${\mathbf{x}_{t}}$ at time ${t}$.
• ${\mathbf{F}}$ describes how the system moves (ideally) from one state to the next, i.e. how one state vector is projected to the next, assuming no noise (e.g. no acceleration)
• ${\mathbf{H}}$ defines the mapping from the state vector, ${\mathbf{x}_{t}}$, to the measurement vector, ${\mathbf{z}_{t}}$.
• ${\mathbf{Q}}$ and ${\mathbf{R}}$ define the Gaussian process and measurement noise, respectively, and characterise the variance of the system.
• ${\mathbf{B}}$ and ${\mathbf{u}}$ are control-input parameters are only used in systems that have an input; these can be ignored in the case of an object tracker.

Note that in a simple system, the current state ${\mathbf{x}_{t}}$ and the measurement ${\mathbf{z}_{t}}$ will contain the same set of state variables (only ${\mathbf{x}_{t}}$ will be a filtered version of ${\mathbf{z}_{t}}$) and ${\mathbf{H}}$ will be an identity matrix, but many real-world systems will include latent variables that are not directly measured. For example, if we are tracking the location of a car, we may be able to directly measure its location from a GPS device and its velocity from the speedometer, but not its acceleration.

In the predict stage, the state of the system and its error covariance are transitioned using the defined transition matrix ${\mathbf{F}}$, and can be implemented in MATLAB as:

<pre>function [x,P] = kalman_predict(x,P,F,Q)
    x = F*x; %predicted state
    P = F*P*F' + Q; %predicted estimate covariance
end</pre>

In the update stage, we first calculate the difference between our predicted and measured states. We then calculate the Kalman gain matrix, ${\mathbf{K}}$, which is used to weight between our predicted and measured states and is adjusted based on a ratio of prediction error ${\mathbf{P}_{t}}$ to measurement noise ${\mathbf{S}_{t}}$.

Finally, the state vector and its error covariance are then updated with the measured state. It can be implemented in MATLAB as:

<pre>function [x,P] = kalman_update(x,P,z,H,R)
    y = z - H*x; %measurement error/innovation
    S = H*P*H' + R; %measurement error/innovation covariance
    K = P*H'*inv(S); %optimal Kalman gain
    x = x + K*y; %updated state estimate
    P = (eye(size(x,1)) - K*H)*P; %updated estimate covariance
end</pre>

Both the stages only update two variables: ${\mathbf{x}_{t}}$, the state variable, and ${\mathbf{P}_{t}}$, the prediction error covariance variable.
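In one dimension, every matrix in those two MATLAB functions collapses to a scalar: the transpose disappears and the matrix inverse in the gain becomes a division. A scalar C++ sketch of the same predict/update cycle (an illustration of the equations, not the tracker's actual 6-state filter):

```cpp
#include <cassert>

// One-dimensional Kalman filter. F*P*F' + Q becomes F*P*F + Q, and the
// inverse of the innovation covariance S becomes a plain division.
struct Kalman1D {
    double x; // state estimate
    double P; // estimate covariance

    void predict(double F, double Q) {
        x = F * x;         // predicted state
        P = F * P * F + Q; // predicted estimate covariance
    }

    void update(double z, double H, double R) {
        double y = z - H * x;     // innovation
        double S = H * P * H + R; // innovation covariance
        double K = P * H / S;     // optimal Kalman gain
        x = x + K * y;            // a posteriori state estimate
        P = (1.0 - K * H) * P;    // a posteriori estimate covariance
    }
};
```

With P equal to R, the gain works out to 0.5 and the estimate moves exactly halfway toward the measurement, which makes the role of K as a prediction/measurement weighting easy to see.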

The two stages of the filter correspond to the state-space model typically used to model linear dynamical systems. The first stage solves the process equation:

$\displaystyle \mathbf{x}_{t+1}=\mathbf{F}\mathbf{x}_{t}+\mathbf{w}_{t}$

The process noise ${\mathbf{w}_{t}}$ is additive white Gaussian noise (AWGN) with zero mean and covariance defined by:

$\displaystyle E\left[\mathbf{w}_{t}\mathbf{w}_{t}^{T}\right]=\mathbf{Q}$

The second one is the measurement equation:

$\displaystyle \mathbf{z}_{t}=\mathbf{H}\mathbf{x}_{t}+\mathbf{v}_{t}$

The measurement noise ${\mathbf{v}_{t}}$ is also AWGN with zero mean and covariance defined by:

$\displaystyle E\left[\mathbf{v}_{t}\mathbf{v}_{t}^{T}\right]=\mathbf{R}$

## Defining the system

In order to implement a Kalman filter, we have to define several variables that model the system. We have to choose the variables contained by ${\mathbf{x}_{t}}$ and ${\mathbf{z}_{t}}$, and also choose suitable values for ${\mathbf{F}}$, ${\mathbf{H}}$, ${\mathbf{Q}}$ and ${\mathbf{R}}$, as well as an initial value for ${\mathbf{P}_{t}}$.

We will define our measurement vector as:

$\displaystyle \mathbf{z}_{t}=\left[\begin{array}{cccc} x_{1,t} & y_{1,t} & x_{2,t} & y_{2,t}\end{array}\right]^{T}$

where $\left(x_{1,t},\, y_{1,t}\right)$ and $\left(x_{2,t},\, y_{2,t}\right)$ are the upper-left and lower-right corners of the bounding box around the detected face, respectively. This is simply the output from the Viola and Jones face detector.

A logical choice for our state vector is:

$\displaystyle \mathbf{x}_{t}=\left[\begin{array}{cccccc} x_{1,t} & y_{1,t} & x_{2,t} & y_{2,t} & dx_{t} & dy_{t}\end{array}\right]^{T}$

where ${dx_{t}}$ and ${dy_{t}}$ are the first-order derivatives. Other vectors are also possible; for example, some papers introduce a “scale” variable, which assumes that the bounding box maintains a fixed aspect ratio.

The transition matrix ${\mathbf{F}}$ defines the equations used to transition from one state vector ${\mathbf{x}_{t}}$ to the next vector ${\mathbf{x}_{t+1}}$ (without taking into account any measurements, ${\mathbf{z}_{t}}$). It is plugged into the process equation:

$\displaystyle \mathbf{x}_{t+1}=\mathbf{F}\mathbf{x}_{t}+\mathbf{w}_{t}$

Let’s look at some basic equations describing motion:

\displaystyle \begin{aligned}x & =x_{0}+dx_{0}\cdot\Delta T+\frac{1}{2}d^{2}x\cdot\Delta T^{2}\\ dx & =dx_{0}+d^{2}x\cdot\Delta T\end{aligned}

We could express this system using the following recurrence:

\displaystyle \begin{aligned}x_{t+1} & =x_{t}+dx_{t}\cdot\Delta T+\frac{1}{2}d^{2}x_{t}\cdot\Delta T^{2}\\ dx_{t+1} & =dx_{t}+d^{2}x_{t}\cdot\Delta T\end{aligned}

These same equations can also be used to model the ${y_{t}}$ variables and their derivatives. Referring back to the process equation, we can thus model this system as:

$\displaystyle \left[\begin{array}{c} x_{1,t+1}\\ y_{1,t+1}\\ x_{2,t+1}\\ y_{2,t+1}\\ dx_{t+1}\\ dy_{t+1}\end{array}\right]=\left[\begin{array}{cccccc} 1 & 0 & 0 & 0 & 1 & 0\\ 0 & 1 & 0 & 0 & 0 & 1\\ 0 & 0 & 1 & 0 & 1 & 0\\ 0 & 0 & 0 & 1 & 0 & 1\\ 0 & 0 & 0 & 0 & 1 & 0\\ 0 & 0 & 0 & 0 & 0 & 1\end{array}\right]\left[\begin{array}{c} x_{1,t}\\ y_{1,t}\\ x_{2,t}\\ y_{2,t}\\ dx_{t}\\ dy_{t}\end{array}\right]+\left[\begin{array}{c} d^{2}x_{t}/2\\ d^{2}y_{t}/2\\ d^{2}x_{t}/2\\ d^{2}y_{t}/2\\ d^{2}x_{t}\\ d^{2}y_{t}\end{array}\right]\times\Delta T$
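With ${\Delta T=1}$, one application of this transition matrix simply adds the matching velocity term to each corner coordinate while the velocities carry over unchanged. A small sketch of that multiplication, hard-coding the sparsity pattern of F above (illustrative only):

```cpp
#include <array>

// One predict-step transition x_{t+1} = F * x_t with dT = 1.
// State layout: [x1, y1, x2, y2, dx, dy]. The x-coordinates gain dx,
// the y-coordinates gain dy, and the velocities persist.
std::array<double, 6> applyF(const std::array<double, 6> &x)
{
    return {x[0] + x[4], x[1] + x[5],
            x[2] + x[4], x[3] + x[5],
            x[4], x[5]};
}
```

Note that because both corners share the same ${dx_{t}}$ and ${dy_{t}}$, this state model translates the bounding box rigidly; it cannot predict changes in box size, which is why some trackers add a scale variable.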

The process noise matrix ${\mathbf{Q}}$ measures the variability of the input signal away from the “ideal” transitions defined in the transition matrix. Larger values in this matrix mean that the input signal has greater variance and the filter needs to be more adaptable. Smaller values result in a smoother output, but the filter is not as adaptable to large changes. This can be a little difficult to define, and may require some fine tuning. Based on our definition of the process noise ${\mathbf{w}_{t}}$ above, our process noise matrix is defined as:

\displaystyle \begin{aligned}\mathbf{Q} & =\left[\begin{array}{cccccc} \Delta T^{4}/4 & 0 & 0 & 0 & \Delta T^{3}/2 & 0\\ 0 & \Delta T^{4}/4 & 0 & 0 & 0 & \Delta T^{3}/2\\ 0 & 0 & \Delta T^{4}/4 & 0 & \Delta T^{3}/2 & 0\\ 0 & 0 & 0 & \Delta T^{4}/4 & 0 & \Delta T^{3}/2\\ \Delta T^{3}/2 & 0 & \Delta T^{3}/2 & 0 & \Delta T^{2} & 0\\ 0 & \Delta T^{3}/2 & 0 & \Delta T^{3}/2 & 0 & \Delta T^{2}\end{array}\right]\times a^{2}\\ & =\left[\begin{array}{cccccc} 1/4 & 0 & 0 & 0 & 1/2 & 0\\ 0 & 1/4 & 0 & 0 & 0 & 1/2\\ 0 & 0 & 1/4 & 0 & 1/2 & 0\\ 0 & 0 & 0 & 1/4 & 0 & 1/2\\ 1/2 & 0 & 1/2 & 0 & 1 & 0\\ 0 & 1/2 & 0 & 1/2 & 0 & 1\end{array}\right]\times10^{-2}\end{aligned}

where ${\Delta T=1}$ and ${a=d^{2}x_{t}=d^{2}y_{t}=0.1}$.
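The scaled form of ${\mathbf{Q}}$ can be reproduced numerically; a minimal Python sketch, assuming ${\Delta T=1}$ and ${a=0.1}$ as in the text:

```python
# Build Q = (matrix of Delta T powers) * a^2, as in the text.
dT = 1.0
a = 0.1
base = [
    [dT**4/4, 0, 0, 0, dT**3/2, 0],
    [0, dT**4/4, 0, 0, 0, dT**3/2],
    [0, 0, dT**4/4, 0, dT**3/2, 0],
    [0, 0, 0, dT**4/4, 0, dT**3/2],
    [dT**3/2, 0, dT**3/2, 0, dT**2, 0],
    [0, dT**3/2, 0, dT**3/2, 0, dT**2],
]
Q = [[entry * a**2 for entry in row] for row in base]
print(Q[0][0], Q[4][4])  # 0.0025 0.01, i.e. the 1/4 and 1 entries times 1e-2
```

With ${\Delta T=1}$ the ${\Delta T}$ powers collapse to the fractions 1/4, 1/2 and 1 shown above, and the ${a^{2}=10^{-2}}$ factor supplies the overall scale.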

The measurement matrix ${\mathbf{H}}$ maps between our measurement vector ${\mathbf{z}_{t}}$ and state vector ${\mathbf{x}_{t}}$. It is plugged in to the measurement equation:

$\displaystyle \mathbf{z}_{t}=\mathbf{H}\mathbf{x}_{t}+\mathbf{v}_{t}$

The variables ${x_{t}}$ and ${y_{t}}$ are mapped directly from ${\mathbf{z}_{t}}$ to ${\mathbf{x}_{t}}$, whereas the derivative variables are latent (hidden) variables and so are not directly measured and are not included in the mapping. This gives us the measurement matrix:

$\displaystyle \mathbf{H}=\left[\begin{array}{cccccc} 1 & 0 & 0 & 0 & 0 & 0\\ 0 & 1 & 0 & 0 & 0 & 0\\ 0 & 0 & 1 & 0 & 0 & 0\\ 0 & 0 & 0 & 1 & 0 & 0\end{array}\right]$

The matrix ${\mathbf{R}}$ defines the error of the measuring device. For a physical instrument such as a speedometer or voltmeter, the measurement accuracy may be defined by the manufacturer. In the case of a face detector, we can determine the accuracy empirically. For instance, we may find that our Viola and Jones face detector detects faces to within 10 pixels of the actual face location 95% of the time. If we assume this error is Gaussian-distributed (which is a requirement of the Kalman filter), this gives us a standard deviation of 6.5 pixels for each of the coordinates, so the standard deviations of the measurement noise components are then given by:

$\displaystyle \mathbf{v}=\left[\begin{array}{cccc} 6.5 & 6.5 & 6.5 & 6.5\end{array}\right]^{T}$

The errors are independent, so our covariance matrix is given by:

$\displaystyle \mathbf{R}=\left[\begin{array}{cccc} 6.5^{2} & 0 & 0 & 0\\ 0 & 6.5^{2} & 0 & 0\\ 0 & 0 & 6.5^{2} & 0\\ 0 & 0 & 0 & 6.5^{2}\end{array}\right]=\left[\begin{array}{cccc} 1 & 0 & 0 & 0\\ 0 & 1 & 0 & 0\\ 0 & 0 & 1 & 0\\ 0 & 0 & 0 & 1\end{array}\right] \times 42.25$

Decreasing the values in ${\mathbf{R}}$ means we are optimistically assuming our measurements are more accurate, so the filter performs less smoothing and the predicted signal will follow the observed signal more closely. Conversely, increasing ${\mathbf{R}}$ means we have less confidence in the accuracy of the measurements, so more smoothing is performed.
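This trade-off is visible directly in the Kalman gain: in the scalar case the gain reduces to ${P/(P+R)}$, the fraction of each measurement residual folded back into the estimate. An illustrative Python sketch (the value of ${P}$ here is arbitrary; 42.25 is taken from ${\mathbf{R}}$ above):

```python
# Scalar Kalman gain K = P / (P + R): how strongly each measurement
# pulls the estimate. Larger R -> smaller gain -> more smoothing.
P = 25.0  # an arbitrary current estimate variance
gains = {R: P / (P + R) for R in (1.0, 42.25, 400.0)}
for R, K in sorted(gains.items()):
    print(f"R = {R:6.2f}  ->  K = {K:.3f}")
```

As ${R}$ grows the gain shrinks towards zero, so the filter leans on its internal model and responds more sluggishly to the observations.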

The estimate covariance matrix ${\mathbf{P}}$ is a measure of the estimated accuracy of ${\mathbf{x}_{t}}$ at time ${t}$. It is adjusted over time by the filter, so we only need to supply a reasonable initial value. If we know for certain the exact state variable at start-up, then we can initialise ${\mathbf{P}}$ to a matrix of all zeros. Otherwise, it should be initialised as a diagonal matrix with a large value along the diagonal:

$\displaystyle \mathbf{P}=\left[\begin{array}{cccccc} 1 & 0 & 0 & 0 & 0 & 0\\ 0 & 1 & 0 & 0 & 0 & 0\\ 0 & 0 & 1 & 0 & 0 & 0\\ 0 & 0 & 0 & 1 & 0 & 0\\ 0 & 0 & 0 & 0 & 1 & 0\\ 0 & 0 & 0 & 0 & 0 & 1\end{array}\right]\times\epsilon$

where ${\epsilon\gg0}$. The filter will then prefer the information from the first few measurements over the information already in the model.
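The implementation below calls kalman_predict and kalman_update helper functions (defined earlier in the full post). For reference, the standard predict/update equations they implement can be sketched in scalar form; this is an illustrative pure-Python reduction with made-up data, not the post's MATLAB code:

```python
# Scalar sketch of the two Kalman steps. The matrix equations from the
# text reduce to ordinary arithmetic for a 1-D state, which makes the
# roles of F, Q, H, R and P easy to see.

def kalman_predict(x, P, F, Q):
    x = F * x            # predicted state:      x = F x
    P = F * P * F + Q    # predicted covariance: P = F P F' + Q
    return x, P

def kalman_update(x, P, z, H, R):
    y = z - H * x        # innovation (measurement residual)
    S = H * P * H + R    # innovation covariance
    K = P * H / S        # Kalman gain
    x = x + K * y        # corrected state estimate
    P = (1 - K * H) * P  # corrected estimate covariance
    return x, P

# Track a constant true value of 5, observed with noise (R = 4), from a
# deliberately poor initial guess and a large initial P.
x, P = 0.0, 1e4
for z in [5.2, 4.9, 5.1, 5.0]:
    x, P = kalman_predict(x, P, F=1.0, Q=0.01)
    x, P = kalman_update(x, P, z, H=1.0, R=4.0)
print(round(x, 2))  # the estimate settles close to 5
```

Note how the large initial ${P}$ makes the first gain nearly 1, so the first measurement almost completely overwrites the (wrong) initial guess, exactly as described above.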

## Implementing the face tracker

The following script implements the system we have defined above. It loads the face detection results from a CSV file, performs the Kalman filtering, and displays the detected bounding boxes.

% read in the detected face locations
fid = fopen('detect_faces.csv');
fgetl(fid); % ignore the header
detections = textscan(fid, '%[^,] %d %d %d %d', 'delimiter', ',');
fclose(fid);

% define the filter
x = [ 0; 0; 0; 0; 0; 0 ];
F = [ 1 0 0 0 1 0 ; ...
      0 1 0 0 0 1 ; ...
      0 0 1 0 1 0 ; ...
      0 0 0 1 0 1 ; ...
      0 0 0 0 1 0 ; ...
      0 0 0 0 0 1 ];
Q = [ 1/4  0   0   0  1/2  0  ; ...
       0  1/4  0   0   0  1/2 ; ...
       0   0  1/4  0  1/2  0  ; ...
       0   0   0  1/4  0  1/2 ; ...
      1/2  0  1/2  0   1   0  ; ...
       0  1/2  0  1/2  0   1  ] * 1e-2;
H = [ 1 0 0 0 0 0 ; ...
      0 1 0 0 0 0 ; ...
      0 0 1 0 0 0 ; ...
      0 0 0 1 0 0 ];
R = eye(4) * 42.25;
P = eye(6) * 1e4;

nsamps = numel(detections{1});
for n = 1:nsamps
    % read the next detected face location
    meas_x1 = detections{2}(n);
    meas_y1 = detections{3}(n);
    meas_x2 = detections{4}(n);
    meas_y2 = detections{5}(n);
    % measurement ordering matches the state vector [x1 y1 x2 y2]
    z = double([meas_x1; meas_y1; meas_x2; meas_y2]);

    % step 1: predict
    [x,P] = kalman_predict(x,P,F,Q);

    % step 2: update (if measurement exists)
    if all(z > 0)
        [x,P] = kalman_update(x,P,z,H,R);
    end

    % draw a bounding box around the detected face
    img = imread(detections{1}{n});
    imshow(img);
    est_z = H*x;
    est_x1 = est_z(1);
    est_y1 = est_z(2);
    est_x2 = est_z(3);
    est_y2 = est_z(4);
    if all(est_z > 0) && est_x2 > est_x1 && est_y2 > est_y1
        rectangle('Position', [est_x1 est_y1 est_x2-est_x1 est_y2-est_y1], ...
            'EdgeColor', 'g', 'LineWidth', 3);
    end
    drawnow;
end

The results of running this script are shown in the following video:
The bounding box around the face is clearly much smoother and more accurate than in the unfiltered version shown previously, and the video no longer has frames with missing detections.

## Closing remarks

In the future, I aim to write an article on the extended Kalman filter (EKF) and unscented Kalman filter (UKF) (and the similar particle filter). These are both non-linear versions of the Kalman filter. Although face trackers are usually implemented using the linear Kalman filter, the non-linear versions have some other interesting applications in image and signal processing.

Posted in Computer Vision, OpenCV

## OpenCV Viola & Jones object detection in MATLAB

Posted by Hemprasad Y. Badgujar on January 19, 2015

In image processing, one of the most successful object detectors devised is the Viola and Jones detector, proposed in their seminal CVPR paper in 2001. A popular implementation used by image processing researchers and implementers is provided by the OpenCV library. In this post, I’ll show you how to run the OpenCV object detector in MATLAB for Windows. You should have some familiarity with OpenCV and with the Viola and Jones detector to work through this tutorial.

### Steps in the object detector

MATLAB is able to call functions in shared libraries. This means that, using the compiled OpenCV DLLs, we are able to directly call various OpenCV functions from within MATLAB. The flow of our MATLAB program, including the required OpenCV external function calls (based on this example), will go something like this:

1. cvLoadHaarClassifierCascade: Load object detector cascade
2. cvCreateMemStorage: Allocate memory for detector
3. cvLoadImage: Load image from disk
4. cvHaarDetectObjects: Perform object detection
5. For each detected object:
    1. cvGetSeqElem: Get next detected object of type CvRect
    2. Display this detection result in MATLAB
6. cvReleaseImage: Unload the image from memory
7. cvReleaseMemStorage: De-allocate memory for detector
8. cvReleaseHaarClassifierCascade: Unload the cascade from memory

### Loading shared libraries

The first step is to load the OpenCV shared libraries using MATLAB’s loadlibrary() function. To use the functions listed in the object detector steps above, we need to load the OpenCV libraries cxcore2410.dll, cv2410.dll and highgui2410.dll. Assuming that OpenCV has been installed to "C:\Program Files\OpenCV", the libraries are loaded like this:

opencvPath = 'C:\Program Files\OpenCV';
includePath = fullfile(opencvPath, 'cxcore\include');
loadlibrary(...
    fullfile(opencvPath, 'bin\cxcore2410.dll'), ...
    fullfile(opencvPath, 'cxcore\include\cxcore.h'), ...
        'alias', 'cxcore2410', 'includepath', includePath);
loadlibrary(...
    fullfile(opencvPath, 'bin\cv2410.dll'), ...
    fullfile(opencvPath, 'cv\include\cv.h'), ...
        'alias', 'cv2410', 'includepath', includePath);
loadlibrary(...
    fullfile(opencvPath, 'bin\highgui2410.dll'), ...
    fullfile(opencvPath, 'otherlibs\highgui\highgui.h'), ...
        'alias', 'highgui2410', 'includepath', includePath);

You will get some warnings; these can be ignored for our purposes. You can display the list of functions that a particular shared library exports with the libfunctions() command in MATLAB. For example, to list the functions exported by the highgui library:

>> libfunctions('highgui2410')

Functions in library highgui2410:

cvConvertImage             cvQueryFrame
cvCreateCameraCapture      cvReleaseCapture
cvCreateFileCapture        cvReleaseVideoWriter
cvCreateTrackbar           cvResizeWindow
cvCreateVideoWriter        cvRetrieveFrame
cvDestroyAllWindows        cvSaveImage
cvDestroyWindow            cvSetCaptureProperty
cvGetCaptureProperty       cvSetMouseCallback
cvGetTrackbarPos           cvSetPostprocessFuncWin32
cvGetWindowHandle          cvSetPreprocessFuncWin32
cvGetWindowName            cvSetTrackbarPos
cvGrabFrame                cvShowImage
cvInitSystem               cvStartWindowThread
cvLoadImage                cvWaitKey
cvLoadImageM               cvWriteFrame
cvMoveWindow
cvNamedWindow

The first step in our object detector is to load a detector cascade. We are going to load one of the frontal face detector cascades that is provided with a normal OpenCV installation:

classifierFilename = 'C:/Program Files/OpenCV/data/haarcascades/haarcascade_frontalface_alt.xml';
cvCascade = calllib('cv2410', 'cvLoadHaarClassifierCascade', classifierFilename, ...
    libstruct('CvSize',struct('width',int16(2410),'height',int16(2410))));

The function calllib() returns a libpointer structure containing two fairly self-explanatory fields, DataType and Value. To display the return value from cvLoadHaarClassifierCascade(), we can run:

>> cvCascade.Value

ans =

               flags: 1.1125e+009
               count: 22
    orig_window_size: [1x1 struct]
    real_window_size: [1x1 struct]
               scale: 0
    stage_classifier: [1x1 struct]
         hid_cascade: []

The above output shows that MATLAB has successfully loaded the cascade file and returned a pointer to an OpenCV CvHaarClassifierCascade object.

### Prototype M-files

We could now continue implementing all of our OpenCV function calls from the object detector steps like this; however, we will run into a problem when cvGetSeqElem is called. To see why, try this:

libfunctions('cxcore2410', '-full')

The -full option lists the signatures for each imported function. The signature for the function cvGetSeqElem() is listed as:

[cstring, CvSeqPtr] cvGetSeqElem(CvSeqPtr, int32)

This shows that the return value for the imported cvGetSeqElem() function will be a pointer to a character (cstring). This is based on the function declaration in the cxcore.h header file:

CVAPI(char*)  cvGetSeqElem( const CvSeq* seq, int index );

However, in step 5.1 of our object detector steps, we require a CvRect object. Normally in C++ you would simply cast the character pointer return value to a CvRect pointer, but MATLAB does not support casting of return values from calllib(), so there is no way we can cast this to a CvRect.
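To see what the missing cast would do, here is an illustrative Python ctypes sketch; the CvRect class is re-declared by hand purely for illustration, and nothing here is loaded from the OpenCV DLLs:

```python
import ctypes

# A hand-written stand-in for OpenCV's CvRect struct.
class CvRect(ctypes.Structure):
    _fields_ = [("x", ctypes.c_int), ("y", ctypes.c_int),
                ("width", ctypes.c_int), ("height", ctypes.c_int)]

# Pretend a library returned a raw char* pointing at a CvRect's bytes:
raw = ctypes.create_string_buffer(bytes(CvRect(10, 20, 100, 150)))
char_ptr = ctypes.cast(raw, ctypes.c_char_p)

# The cast a C++ caller would write as (CvRect*)ptr. This reinterpretation
# is exactly the step MATLAB's calllib() cannot perform, so the declared
# return type has to be changed instead.
rect_ptr = ctypes.cast(char_ptr, ctypes.POINTER(CvRect))
print(rect_ptr.contents.x, rect_ptr.contents.width)  # 10 100
```

The bytes behind the pointer never change; only the type through which they are interpreted does, which is why fixing the declared signature (as the prototype M-file approach does) is equivalent to the C++ cast.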

The solution is what is referred to as a prototype M-file. By constructing a prototype M-file, we can define our own signatures for the imported functions rather than using the declarations from the C++ header file.

Let’s generate the prototype M-file now:

loadlibrary(...
    fullfile(opencvPath, 'bin\cxcore2410.dll'), ...
    fullfile(opencvPath, 'cxcore\include\cxcore.h'), ...
        'mfilename', 'proto_cxcore');

This will automatically generate a prototype M-file named proto_cxcore.m based on the C++ header file. Open this file up and find the function signature for cvGetSeqElem and replace it with the following:

% char * cvGetSeqElem ( const CvSeq * seq , int index );
fcns.name{fcnNum}='cvGetSeqElem';
fcns.calltype{fcnNum}='cdecl';
fcns.LHS{fcnNum}='CvRectPtr';
fcns.RHS{fcnNum}={'CvSeqPtr', 'int32'};
fcnNum=fcnNum+1;

This changes the return type for cvGetSeqElem() from a char pointer to a CvRect pointer.

We can now load the library using the new prototype:

loadlibrary(...
    fullfile(opencvPath, 'bin\cxcore2410.dll'), ...
        @proto_cxcore);

### An example face detector

We now have all the pieces ready to write a complete object detector. The code listing below implements the object detector steps listed above to perform face detection on an image. Additionally, the image is displayed in MATLAB and a box is drawn around any detected faces.

opencvPath = 'C:\Program Files\OpenCV';
includePath = fullfile(opencvPath, 'cxcore\include');
inputImage = 'lenna.jpg';

%% Load the required libraries
if libisloaded('highgui2410'), unloadlibrary highgui2410, end
if libisloaded('cv2410'), unloadlibrary cv2410, end
if libisloaded('cxcore2410'), unloadlibrary cxcore2410, end
loadlibrary(...
    fullfile(opencvPath, 'bin\cxcore2410.dll'), @proto_cxcore);
loadlibrary(...
    fullfile(opencvPath, 'bin\cv2410.dll'), ...
    fullfile(opencvPath, 'cv\include\cv.h'), ...
        'alias', 'cv2410', 'includepath', includePath);
loadlibrary(...
    fullfile(opencvPath, 'bin\highgui2410.dll'), ...
    fullfile(opencvPath, 'otherlibs\highgui\highgui.h'), ...
        'alias', 'highgui2410', 'includepath', includePath);

%% Load the cascade
classifierFilename = 'C:/Program Files/OpenCV/data/haarcascades/haarcascade_frontalface_alt.xml';
cvCascade = calllib('cv2410', 'cvLoadHaarClassifierCascade', classifierFilename, ...
    libstruct('CvSize',struct('width',int16(2410),'height',int16(2410))));

%% Create memory storage
cvStorage = calllib('cxcore2410', 'cvCreateMemStorage', 0);

%% Load the input image
cvImage = calllib('highgui2410', ...
    'cvLoadImage', inputImage, int16(1));
if ~cvImage.Value.nSize
    error('Image could not be loaded');
end

%% Perform object detection
cvSeq = calllib('cv2410', ...
    'cvHaarDetectObjects', cvImage, cvCascade, cvStorage, 1.1, 2, 0, ...
    libstruct('CvSize',struct('width',int16(40),'height',int16(40))));

%% Loop through the detections and display bounding boxes
imshow(imread(inputImage)); % load and display image in MATLAB
for n = 1:cvSeq.Value.total
    cvRect = calllib('cxcore2410', ...
        'cvGetSeqElem', cvSeq, int16(n));
    rectangle('Position', ...
        [cvRect.Value.x cvRect.Value.y ...
         cvRect.Value.width cvRect.Value.height], ...
        'EdgeColor', 'r', 'LineWidth', 3);
end

%% Release resources
calllib('cxcore2410', 'cvReleaseImage', cvImage);
calllib('cxcore2410', 'cvReleaseMemStorage', cvStorage);
calllib('cv2410', 'cvReleaseHaarClassifierCascade', cvCascade);

As an example, the following is the output after running the detector above on a greyscale version of the Lenna test image:

Note: If you get a segmentation fault attempting to run the code above, try evaluating the cells one-by-one (e.g. by pressing Ctrl-Enter) – it seems to fix the problem.

Posted in Computer Vision, OpenCV, OpenCV Tutorial
