Something More for Research

Explorer of Research #HEMBAD

Posts Tagged ‘OpenCV’

Installing OpenCV 3.0 and Python 3.4 on Windows

Posted by Hemprasad Y. Badgujar on May 6, 2016

I recently decided to update to the newest OpenCV version (3.0) and wanted to add Python 3.4+ along with it. It took me a couple hours to set up because I couldn’t find a good tutorial on how to install them on Windows. So, here’s my attempt at a tutorial based on what I have just been through. Hope it helps.

For the rest of this post, I will show you how to compile and install OpenCV 3.0 with Python 3.4+ bindings on Windows.

Step 1

In order to use python 3.4+ with openCV, the first step is to build our own version of openCV using CMake and Visual Studio (I’m using Visual Studio 2013 Express for Desktop), since the prebuilt binaries on the openCV website include the python 2.7 libraries and not the 3.4+ libraries. So, if you have not done so, install CMake and Visual Studio before continuing.

Step 2

To use python with openCV, aside from installing the core python packages, you also need to install Numpy (a python array and matrix library). You can install them separately from their own websites. However, I like to use a third-party python distribution, specifically Anaconda. It gives you the common python libraries bundled with the core python packages, so you install everything in a single install. So, the next step is to download and install python+numpy using Anaconda, keeping the recommended install settings.


It’s good to use virtual environments with your python installation so you can have several versions on one machine for the different types of applications you’ll be developing. However, I won’t get into this here.

Step 3

Next, we need to download the openCV source. Remember, we need to build a custom openCV build from the source and will not use the prebuilt binaries available for download from the openCV website. There are a couple of ways to download the source, and both of them involve the openCV GitHub webpage.

The easiest way to download the source is to download a zip file containing the contents of the openCV GitHub page. This can be done by pressing the Download Zip button towards the right side of the page. Once you’re done, extract the contents of the zip file to a folder; for convenience, name it opencv.


If you want to receive updated versions of openCV as they are made by the contributors, you can also clone the source using Git. If you’re familiar with git then you can just use the clone URL as shown in the above image or fork a version of the code for yourself. If you are just getting started with Git, you can use the Clone in Desktop button to copy the updated version of openCV to your local machine. For this to work, you will be asked to download the GitHub Desktop application (you can just follow the instructions from GitHub on how to install this application).

Step 4

Now that all the tools we need to build our very own openCV have been installed, we will start building our openCV binaries. To start the process, create a new folder called build inside your openCV directory (the directory you unzipped/cloned the openCV source to).

We use CMake, the application installed in Step 1, to generate the build files for the openCV binaries from the source code. So, open CMake-gui from the Start Menu. Now, near the top of the CMake window, choose the location of your source code (the openCV directory) and choose the location to build the binaries in (the build folder you just created). I chose to put my openCV directory in C:/opencv, so for me the source location is C:/opencv and the build location is C:/opencv/build.


Now, the next thing to do is to configure your build by clicking the Configure button in the CMake window. A pop-up will prompt you to select a compiler; choose Visual Studio 12 2013 or the VS version you have installed on your machine. Once chosen, click Finish and the configuration process will start. Once it’s done, the status window should say Configuring done.


Once the configuration is complete, you will receive fields marked in red in the above display window. If your result is just a long list of fields, make sure to check the Grouped checkbox so that they are nicely grouped like the image below.

First, the WITH field describes the features you want to include in your openCV binaries. You can include or exclude a feature by using the checkbox in the list. The defaults should be fine, but this is all up to you. If you want to know what each field does, just hover over it and an explanation will pop up.


Next, you need to configure the BUILD field. The BUILD field configures the build method used to build the binaries and also the modules that are to be built into the binaries. The fields in ALL CAPS are the build methods and the rest are the modules to be built. You can keep the methods as is. This is also the case for the modules, with one exception: since we are building for python 3.4+ we do not need to build for python 2+, so uncheck the BUILD_opencv_python2 checkbox. If left checked, it will cause an error during the build process if you do not have python 2+ installed on your machine.


Now that everything is configured, click the Generate button to create the build files inside the build folder.

Step 5

Once the build files have been generated by CMake, the next step is to create the binaries using Visual Studio as the compiler. First, go to your opencv/build directory, then find and open the openCV solution (opencv.sln). Once the solution is open, the Solution Explorer lists the projects of the build, including ALL_BUILD and INSTALL.


Before we build anything, change the build mode to Release instead of Debug. Now, right-click on the Solution ‘OpenCV’ or on ALL_BUILD and select Build. This will start the build process, which may take some time.

Once the build is complete, right-click on INSTALL to install openCV-Python on your machine.

Step 6

Once the installation is complete, we need to verify the installation by using python IDLE. Just search for IDLE in the Start Menu and run the program. Type import cv2 in the command line and hit Enter. If no error is found then congratulations, you have just successfully built and installed openCV 3.0 with python 3.4+ bindings on Windows.
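The check described above can also be scripted. The snippet below is a minimal sketch, not part of the original tutorial: the helper name opencv_version is made up for illustration, and the only things it relies on are the cv2 module name and its __version__ attribute. It reports a failed import instead of raising, which is handy when the DLL problem described in the notes below occurs.

```python
import importlib


def opencv_version():
    """Return the OpenCV version string, or None if cv2 cannot be imported.

    opencv_version is a hypothetical helper name used only in this sketch.
    """
    try:
        cv2 = importlib.import_module("cv2")
    except ImportError:
        # Raised when the bindings are missing or a required DLL is not found.
        return None
    return cv2.__version__


version = opencv_version()
if version is None:
    print("cv2 could not be imported; check the build and your system path")
else:
    print("OpenCV version:", version)
```

Run it with the same python interpreter that the openCV build was configured against.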


Additional Notes:

  • You can also check the openCV version you have installed with python by using the cv2.__version__ command (as shown above).
  • One error I did receive was that the openCV dll could not be found when calling the command import cv2. This can be solved by adding the Release folder created during the build process to your system path (for me, it was C:\opencv\build\bin\Release)
  • In order to code python in Visual Studio, you need to use PTVS (Python Tools for Visual Studio). You can download PTVS from Microsoft’s PTVS page here.
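If import cv2 fails with the missing-dll error from the notes above, it helps to confirm that the Release folder really is on the system path. The following is a small sketch using only the standard library; the function name is_on_path is hypothetical, and C:\opencv\build\bin\Release is simply the location mentioned above, so substitute your own build folder.

```python
import os


def is_on_path(directory, path_value=None):
    """Return True if `directory` is one of the entries of the PATH variable.

    is_on_path is a hypothetical helper; path_value defaults to the current PATH.
    """
    if path_value is None:
        path_value = os.environ.get("PATH", "")
    # Normalise separators and case so equivalent spellings compare equal.
    target = os.path.normcase(os.path.normpath(directory))
    entries = [
        os.path.normcase(os.path.normpath(entry))
        for entry in path_value.split(os.pathsep)
        if entry
    ]
    return target in entries


# Assumed location, taken from the note above; adjust to your own build.
release_dir = r"C:\opencv\build\bin\Release"
print(release_dir, "on PATH:", is_on_path(release_dir))
```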

Posted in OpenCV

OpenCV: Color-spaces and splitting channels

Posted by Hemprasad Y. Badgujar on July 18, 2015

Conversion between color-spaces

Our goal here is to visualize each of the three channels of these color-spaces: RGB, HSV, YCrCb and Lab. In general, none of them are absolute color-spaces and the last three (HSV, YCrCb and Lab) are ways of encoding RGB information. Our images will be read in BGR (Blue-Green-Red), because of OpenCV defaults. For each of these color-spaces there is a mapping function and they can be found at OpenCV cvtColor documentation.
One important point: OpenCV’s imshow() function always assumes that the Mat shown is in the BGR color-space, which means we always need to convert back to BGR to see what we want. Let’s start.


HSV

While in BGR an image is treated as an additive result of three base colors (blue, green and red), HSV stands for Hue, Saturation and Value (Brightness). We can say that HSV is a rearrangement of RGB in a cylindrical shape. The HSV ranges are:

  • 0 ≤ H < 360 ⇒ OpenCV range = H/2 (0 ≤ H < 180)
  • 0 ≤ S ≤ 1 ⇒ OpenCV range = 255*S (0 ≤ S ≤ 255)
  • 0 ≤ V ≤ 1 ⇒ OpenCV range = 255*V (0 ≤ V ≤ 255)
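To make the scaling concrete, here is a minimal sketch that maps an 8-bit RGB triple to the 8-bit HSV ranges listed above. It uses Python’s standard colorsys module rather than OpenCV itself, the function name rgb_to_opencv_hsv is made up for this example, and rounding may differ slightly from what cvtColor produces.

```python
import colorsys


def rgb_to_opencv_hsv(r, g, b):
    """Map 8-bit RGB values to the 8-bit HSV ranges (H/2, 255*S, 255*V).

    Hypothetical helper for illustration; it does not call OpenCV.
    """
    # colorsys works on floats in [0, 1] and returns h, s, v in [0, 1].
    h, s, v = colorsys.rgb_to_hsv(r / 255.0, g / 255.0, b / 255.0)
    hue_degrees = h * 360.0
    # Halve the hue so it fits in 8 bits; wrap 180 back to 0.
    h8 = int(round(hue_degrees / 2.0)) % 180
    return h8, int(round(s * 255)), int(round(v * 255))


print(rgb_to_opencv_hsv(255, 0, 0))  # pure red
print(rgb_to_opencv_hsv(0, 255, 0))  # pure green
print(rgb_to_opencv_hsv(0, 0, 255))  # pure blue
```

Note that the input order here is RGB; an image loaded by OpenCV stores its channels as BGR.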

YCrCb or YCbCr

It is used widely in video and image compression schemes. The YCrCb stands for Luminance (sometimes you can see Y’ as luma), Red-difference and Blue-difference chroma components. The YCrCb ranges are:

  • 0 ≤ Y ≤ 255
  • 0 ≤ Cr ≤ 255
  • 0 ≤ Cb ≤ 255
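As an illustration of how these three values relate to BGR, the sketch below applies the RGB to YCrCb mapping given in the OpenCV cvtColor documentation for 8-bit images (offset delta = 128). The function name bgr_to_ycrcb is hypothetical, and the result is rounded and clamped here in plain Python, so individual values may differ by one from OpenCV’s own output.

```python
def bgr_to_ycrcb(b, g, r):
    """Convert one 8-bit BGR pixel to (Y, Cr, Cb).

    Hypothetical helper; the coefficients follow the cvtColor documentation.
    """
    delta = 128  # offset used for 8-bit images
    y = 0.299 * r + 0.587 * g + 0.114 * b
    cr = (r - y) * 0.713 + delta
    cb = (b - y) * 0.564 + delta

    def to_uint8(value):
        # Round to the nearest integer and clamp to the 0..255 range.
        return max(0, min(255, int(round(value))))

    return to_uint8(y), to_uint8(cr), to_uint8(cb)


print(bgr_to_ycrcb(128, 128, 128))  # a mid gray: no color difference
print(bgr_to_ycrcb(0, 0, 255))      # pure red: large red-difference
```

A gray pixel lands on 128 for both chroma components, which is why 128 acts as the neutral value when the Cr and Cb channels are visualized.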


Lab or CIE Lab

In this color-opponent space, L stands for the lightness dimension, while a and b are the color-opponent dimensions. The Lab ranges are:

  • 0 ≤ L ≤ 100 ⇒ OpenCV range = L*255/100 (0 ≤ L ≤ 255)
  • -127 ≤ a ≤ 127 ⇒ OpenCV range = a + 128 (1 ≤ a ≤ 255)
  • -127 ≤ b ≤ 127 ⇒ OpenCV range = b + 128 (1 ≤ b ≤ 255)
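The 8-bit packing of Lab values is plain arithmetic, sketched below. The function name lab_to_8bit is made up for this example; it only applies the scaling listed above and does not perform the BGR to Lab conversion itself.

```python
def lab_to_8bit(lightness, a, b):
    """Scale Lab values (L in 0..100, a and b in -127..127) to 8-bit storage.

    Hypothetical helper that applies L*255/100, a + 128 and b + 128.
    """
    return (
        int(round(lightness * 255 / 100)),
        int(round(a + 128)),
        int(round(b + 128)),
    )


print(lab_to_8bit(100, 0, 0))     # maximum lightness, neutral color
print(lab_to_8bit(0, -127, 127))  # extremes of the a and b axes
```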

Splitting channels

All the color-spaces mentioned above were constructed using three channels (dimensions). It is a good exercise to visualize each of these channels and realize what they really store, because when I say that the third channel of HSV stores the brightness, what do you expect to see? Remember: a colored image is made of three-channels (in our cases) and when we see each of them separately, what do you think the output will be? If you said a grayscale image, you are correct! However, you might have seen these channels as colored images out there. So, how? For that, we need to choose a fixed value for the other two channels. Let’s do this!
To visualize each channel with color, I used the same values used on the Slides 53 to 65 from CS143, Lecture 03 from Brown University.
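The idea of fixing the other two channels can be sketched without OpenCV, using nested lists of (B, G, R) tuples as a stand-in for an image. Both function names are hypothetical. For BGR a fill value of 0 is the natural choice; for the other color-spaces a different fixed value is needed and the result must be converted back to BGR before display, which this sketch does not do.

```python
def extract_channel(image, channel):
    """Return one channel as a single-channel (grayscale) image."""
    return [[pixel[channel] for pixel in row] for row in image]


def isolate_channel(image, channel, fill=0):
    """Return a 3-channel copy where the other two channels are set to `fill`."""
    return [
        [tuple(pixel[c] if c == channel else fill for c in range(3)) for pixel in row]
        for row in image
    ]


# A tiny 1x2 "image" in BGR order; index 2 is the red channel.
image = [[(10, 20, 30), (40, 50, 60)]]
print(extract_channel(image, 2))  # grayscale view of the red channel
print(isolate_channel(image, 2))  # red kept, blue and green zeroed
```

The first result is what you get when a channel is shown on its own, and the second is the colored version of the same channel.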


BGR

Original image (a) and its channels with color: blue (b), green (c) and red (d). On the second row, each channel in grayscale (single channel image), respectively.


HSV

Original image (a) and its channels with color: hue (b), saturation (c) and value or brightness (d). On the second row, each channel in grayscale (single channel image), respectively.

YCrCb or YCbCr

Original image (a) and its channels with color: luminance (b), red-difference (c) and blue difference (d). On the second row, each channel in grayscale (single channel image), respectively.

Lab or CIE Lab

Original image (a) and its channels with color: luminance (b), a-dimension (c) and b-dimension (d). On the second row, each channel in grayscale (single channel image), respectively.

Posted in Computer Vision, GPU (CUDA), OpenCV, OpenCV Tutorial

OpenCV CUDA Sample Program

Posted by Hemprasad Y. Badgujar on July 17, 2015

Design considerations

The OpenCV GPU module is written using CUDA, so it benefits from the CUDA ecosystem: a large community, conferences, publications, and many tools and libraries such as NVIDIA NPP, CUFFT and Thrust.

The GPU module is designed as a host API extension. This design gives the user explicit control over how data is moved between CPU and GPU memory. Although the user has to write some additional code to start using the GPU, this approach is both flexible and allows more efficient computations.

The GPU module includes the class cv::gpu::GpuMat, which is the primary container for data kept in GPU memory. Its interface is very similar to that of cv::Mat, its CPU counterpart. All GPU functions receive GpuMat as input and output arguments. This makes it possible to invoke several GPU algorithms without downloading data. The GPU module API is also kept similar to the CPU interface where possible, so developers who are familiar with OpenCV on the CPU can start using the GPU straightaway.

Short sample

In the sample below an image is loaded from a PNG file, then uploaded to the GPU, thresholded, downloaded and displayed.

#include <iostream>
#include "opencv2/opencv.hpp"
#include "opencv2/gpu/gpu.hpp"

int main (int argc, char* argv[])
{
    try
    {
        cv::Mat src_host = cv::imread("file.png", CV_LOAD_IMAGE_GRAYSCALE);
        cv::gpu::GpuMat dst, src;
        src.upload(src_host);  // copy the image from CPU to GPU memory

        cv::gpu::threshold(src, dst, 128.0, 255.0, CV_THRESH_BINARY);

        cv::Mat result_host;
        dst.download(result_host);  // copy the result back to CPU memory
        cv::imshow("Result", result_host);
        cv::waitKey();  // keep the window open until a key is pressed
    }
    catch(const cv::Exception& ex)
    {
        std::cout << "Error: " << ex.what() << std::endl;
    }
    return 0;
}

Posted in Mixed

Assessing the pixel values of an image

Posted by Hemprasad Y. Badgujar on March 14, 2015

#include "opencv2/core/core.hpp"
#include "opencv2/highgui/highgui.hpp"
#include "opencv2/imgproc/imgproc.hpp"
#include <iostream>

using namespace cv;
using namespace std;

int main( )
{
 Mat src1;
 src1 = imread("lena.jpg", CV_LOAD_IMAGE_COLOR);
 namedWindow( "Original image", CV_WINDOW_AUTOSIZE );
 imshow( "Original image", src1 );

 Mat gray;
 cvtColor(src1, gray, CV_BGR2GRAY);
 namedWindow( "Gray image", CV_WINDOW_AUTOSIZE );
 imshow( "Gray image", gray );

 // know the number of channels the image has
 cout << "original image channels: " << src1.channels() << " gray image channels: " << gray.channels() << endl;

 // ******************* READ the pixel intensity *********************
 // single channel grey scale image (type 8UC1) and pixel coordinates x=5 and y=2
 // by convention, {row number = y} and {column number = x}
 // intensity.val[0] contains a value from 0 to 255
 Scalar intensity1 = gray.at<uchar>(2, 5);
 cout << "Intensity = " << endl << " " << intensity1.val[0] << endl << endl;

 // 3 channel image with BGR color (type 8UC3)
 // the values can be stored in "int" or in "uchar". Here int is used.
 // NOTE: the row index was lost from the original listing; 10 is a placeholder.
 Vec3b intensity2 = src1.at<Vec3b>(10, 15);
 int blue = intensity2.val[0];
 int green = intensity2.val[1];
 int red = intensity2.val[2];
 cout << "Intensity = " << endl << " " << blue << " " << green << " " << red << endl << endl;

 // ******************* WRITE to pixel intensity **********************
 // This is an example from the OpenCV documentation
 Mat H(10, 10, CV_64F);
 for(int i = 0; i < H.rows; i++)
  for(int j = 0; j < H.cols; j++)
   H.at<double>(i,j) = 1./(i+j+1);
 cout << H << endl << endl;

 // Modify the pixels of the BGR image
 // NOTE: the loop start indices were lost from the original listing;
 // 0 is assumed here, which repaints the whole image.
 for (int i = 0; i < src1.rows; i++)
 {
  for (int j = 0; j < src1.cols; j++)
  {
   src1.at<Vec3b>(i,j)[0] = 0;
   src1.at<Vec3b>(i,j)[1] = 200;
   src1.at<Vec3b>(i,j)[2] = 0;
  }
 }
 namedWindow( "Modify pixel", CV_WINDOW_AUTOSIZE );
 imshow( "Modify pixel", src1 );

 waitKey(0);  // keep the windows open until a key is pressed
 return 0;
}

Posted in OpenCV, OpenCV Tutorial

How to get started with wxWidgets on Windows

Posted by Hemprasad Y. Badgujar on February 3, 2015

wxWidgets is a cross-platform GUI library that is also available for Windows. You can get started with wxWidgets in a few steps:

  1. Download and install the Windows installer for the current stable release of wxWidgets from its download page. It installs the source and build files in C:, for example in C:\wxWidgets-3.0.2\
  2. wxWidgets needs to be built before it can be used with your application. Go to C:\wxWidgets-3.0.2\build\msw and open the .sln file that matches the Visual Studio version you intend to use for your application. For example, I open wx_vc10.sln using Visual Studio 2012.
  3. Choose one of the build types: Debug, Release, DLL Debug or DLL Release and build the solution. The resulting .lib files are placed in C:\wxWidgets-3.0.2\lib\vc_lib
  4. Create a new Visual Studio solution for your C++ application. Remember that it has to be a Win32 Project, not a Win32 Console Project. The difference is that the main function is defined inside wxWidgets and does not need to be defined in your application code.
  5. Add a .cpp file to your solution and copy the Hello World code into it.
  6. Add C:\wxWidgets-3.0.2\include and C:\wxWidgets-3.0.2\include\msvc as additional include directories to the solution.
  7. Add C:\wxWidgets-3.0.2\lib\vc_lib as additional library directory to the solution.
  8. Build the solution and run it to see an empty wxWidgets window.

Posted in Computer Vision, Entertainment, Free Tools, My Research Related, OpenCV

Professional ways of tracking GPU memory leakage

Posted by Hemprasad Y. Badgujar on January 25, 2015

Depending on what I am doing and what I need to track/trace and profile I utilise all 4 packages above. They also have the added benefit of being a: free; b: well maintained; c: free; d: regularly updated; e: free.

In case you hadn’t guessed I like the free part:)

Regarding object management, I would recommend an old C++ coding principle: as soon as you create an object, add the line that deletes it. Every new should always (eventually) have a delete. That way you know that you are destroying the objects you create. However, it will not save you from orphaned memory blocks, where you change where pointers are pointing, for example:

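myclass* firstInstance = new myclass();
myclass* secondInstance = new myclass();
firstInstance = secondInstance;  // the first object is now unreachable (leaked)
delete firstInstance;            // deletes the second object
delete secondInstance;           // deletes the same object again: undefined behavior

You will now have created a small memory leak, because the data for the real firstInstance is no longer pointed at by any pointer. The two delete calls also free the same object twice, which is undefined behavior. This is very hard to detect when it happens in a large code-base, and more common than it should be.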

Generally, these are the pairings you need to be aware of to ensure you properly dispose of all your objects:

new -> delete
new[] -> delete[]
malloc() -> free()
calloc() -> free()
realloc(nonzero) -> free()

Do not rely on realloc(ptr, 0) as a substitute for free(): its behavior is implementation-defined in older C standards and undefined in C23.

If you are coming from a language with garbage collection to C++ it can take a while to get used to, but it quickly becomes habit:)

Posted in C, Computer Languages, Computer Vision, Computing Technology, CUDA

Detect and Track Objects

Posted by Hemprasad Y. Badgujar on January 20, 2015

Detect and Track Objects With OpenCV

In the following, I give an overview of tutorials and guides for getting started with OpenCV for detecting and tracking objects. OpenCV is a computer vision library designed to analyze, process, and understand objects in images, with the aim of producing information.

  • OpenCV Tutorials – comprehensive list with basic OpenCV tutorials and source code based on the OpenCV library;
  • Object Detection & Tracking Using Color – example of application where OpenCV is used to detect objects based on color differences;
  • Face Detection Using OpenCV – guide on how to use OpenCV to detect one or more faces in the same image;
  • SURF in OpenCV – tutorial on how to use the SURF algorithm, which detects key-points and computes descriptors in images;
  • Introduction to Face Detection and Face Recognition – face detection and recognition are two of the most common applications of computer vision in robotics, and this tutorial presents the steps by which a face is detected and recognized in images;
  • Find Objects with a Webcam – using a simple webcam mounted on a robot and the Simple Qt interface designed to work with OpenCV, this tutorial shows how objects can be detected and tracked in images;
  • Features 2D + Homography to Find a Known Object – tutorial with code and explanation of two important functions included in OpenCV, findHomography and perspectiveTransform, which are used to find objects in images. findHomography estimates the perspective transformation between two sets of matched key-points, while perspectiveTransform applies that transformation to map points from one image into the other;
  • Back Projection – tutorial based on calcBackProject function designed to calculate the back project of the histogram;
  • Tracking Colored Objects in OpenCV – tutorial for detection and tracking the colored objects from images using the OpenCV library;
  • OpenCV Tutorials – Based on “Learning OpenCV – Computer Vision with the OpenCV Library” – in order to be familiar with computer vision concepts, these tutorials can be useful for beginner and advanced users to start building applications or to improve the skills;
  • Image Processing on Pandaboard using OpenCV and Kinect – in this presentation you can find information about image processing with a Pandaboard single board computer using the Kinect sensor and the OpenCV library;
  • Video Capture using OpenCV with VC++ – the OpenCV library can be integrated with Visual Studio, and this article explains how to use Visual C++ together with OpenCV;

Tutorials for Detecting and Tracking Objects with Mobile Devices

Mobile devices such as smartphones and tablets with iOS or Android operating systems can be integrated into robots and used to detect and track objects. Below is an overview of tutorials with comprehensive information for tracking objects using different mobile devices.

 Particle filter based trackers

  • Particle Filter Color Tracker [Link 1]
    • Matlab and c/c++ code.
    • Key words: region tracker, color histogram, ellipsoidal region, particle filter, SIR resampling.
  • Region Tracker based on a color Particle Filter [Link 1] [Example]
    • Matlab and c/c++ code.
    • Key words: region tracker, color histogram, ellipsoidal region, particle filter, SIR resampling.
  • Region Tracker based on an intensity Particle Filter [Link]
    • Matlab and c/c++ code.
    • Key words: region tracker, intensity histogram, ellipsoidal region, particle filter, SIR resampling.
  • Particle Filter Object Tracking [Link]
    • C/C++.

Mean shift based trackers

  • Scale and Orientation Adaptive Mean Shift Tracking. [Link]
    • Matlab.
  • Robust Mean Shift  Tracking with Corrected Background-Weighted Histogram. [Link]
    • Matlab.
  • Robust Object Tracking using Joint Color-Texture Histogram. [Link]
    • Matlab.
  • Tracking moving video objects using mean-shift algorithm. [Link]
    • Matlab.
  • Mean-shift Based Moving Object Tracker [Link]
    • C/C++.
  • Mean-Shift Video Tracking [Link]
    • Matlab.
  • Gray scale mean shift algorithm for tracking. [Link]
    • Matlab.
  • Mean shift tracking algorithm for tracking [Link]
    • Matlab.

Deformable/articulable object trackers

  • Visual Tracking with Integral Histograms and Articulating Blocks [Link]
    • Matlab and c/c++ code
    • Key words: region tracker, intensity histogram, multi-rectangular regions, integral histogram, exhaustive search, graph cut segmentation.

Appearance learning based trackers

  • Robust Object Tracking with Online Multiple Instance Learning. [Link]
    • C/C++.
  • Visual Tracking via Adaptive Structural Local Sparse Appearance Model. [Link]
    • Matlab.
  • Online Discriminative Object Tracking with Local Sparse Representation. [Link]
    • Matlab
  • Superpixel Tracking. [Link]
    • Matlab.
  • Online Multiple Support Instance Tracking. [Link]
    • Matlab.
  • Incremental Learning for Robust Visual Tracking. [Link]
    • Matlab.
  • Tracking with Online Multiple Instance Learning (MILTrack). [Link]
    • C/C++, OpenCV
  • Predator. [Link]
    • Matlab.
  • Object Tracking via Partial Least Squares Analysis. [Link]
    • Matlab.
  • Robust Object Tracking via Sparsity-based Collaborative Model. [Link]
    • Matlab.
  • On-line boosting trackers. [Link]
    • C/C++.

Advanced appearance model based trackers

  • Real-Time Compressive Tracking [Link]


Below is a list with resources including OpenCV documentation, libraries, and OpenCV compatible tools.

Posted in Computer Vision, OpenCV, OpenCV Tutorial

Computer Vision source codes

Posted by Hemprasad Y. Badgujar on January 19, 2015

Feature Detection and Description

General Libraries:

  • VLFeat – Implementation of various feature descriptors (including SIFT, HOG, and LBP) and covariant feature detectors (including DoG, Hessian, Harris Laplace, Hessian Laplace, Multiscale Hessian, Multiscale Harris). Easy-to-use Matlab interface. See Modern features: Software – Slides providing a demonstration of VLFeat and also links to other software. Check also VLFeat hands-on session training
  • OpenCV – Various implementations of modern feature detectors and descriptors (SIFT, SURF, FAST, BRIEF, ORB, FREAK, etc.)

Fast Keypoint Detectors for Real-time Applications:

  • FAST – High-speed corner detector implementation for a wide variety of platforms
  • AGAST – Even faster than the FAST corner detector. A multi-scale version of this method is used for the BRISK descriptor (ECCV 2010).

Binary Descriptors for Real-Time Applications:

  • BRIEF – C++ code for a fast and accurate interest point descriptor (not invariant to rotations and scale) (ECCV 2010)
  • ORB – OpenCV implementation of the Oriented-Brief (ORB) descriptor (invariant to rotations, but not scale)
  • BRISK – Efficient Binary descriptor invariant to rotations and scale. It includes a Matlab mex interface. (ICCV 2011)
  • FREAK – Faster than BRISK (invariant to rotations and scale) (CVPR 2012)

SIFT and SURF Implementations:

Other Local Feature Detectors and Descriptors:

  • VGG Affine Covariant features – Oxford code for various affine covariant feature detectors and descriptors.
  • LIOP descriptor – Source code for the Local Intensity order Pattern (LIOP) descriptor (ICCV 2011).
  • Local Symmetry Features – Source code for matching of local symmetry features under large variations in lighting, age, and rendering style (CVPR 2012).

Global Image Descriptors:

  • GIST – Matlab code for the GIST descriptor
  • CENTRIST – Global visual descriptor for scene categorization and object detection (PAMI 2011)

Feature Coding and Pooling

  • VGG Feature Encoding Toolkit – Source code for various state-of-the-art feature encoding methods – including Standard hard encoding, Kernel codebook encoding, Locality-constrained linear encoding, and Fisher kernel encoding.
  • Spatial Pyramid Matching – Source code for feature pooling based on spatial pyramid matching (widely used for image classification)

Convolutional Nets and Deep Learning

  • Caffe – Fast C++ implementation of deep convolutional networks (GPU / CPU / ImageNet 2013 demonstration).
  • OverFeat – C++ library for integrated classification and localization of objects.
  • EBLearn – C++ Library for Energy-Based Learning. It includes several demos and step-by-step instructions to train classifiers based on convolutional neural networks.
  • Torch7 – Provides a matlab-like environment for state-of-the-art machine learning algorithms, including a fast implementation of convolutional neural networks.
  • Deep Learning – Various links for deep learning software.

Facial Feature Detection and Tracking

  • IntraFace – Very accurate detection and tracking of facial features (C++/Matlab API).

Part-Based Models

Attributes and Semantic Features

Large-Scale Learning

  • Additive Kernels – Source code for fast additive kernel SVM classifiers (PAMI 2013).
  • LIBLINEAR – Library for large-scale linear SVM classification.
  • VLFeat – Implementation for Pegasos SVM and Homogeneous Kernel map.

Fast Indexing and Image Retrieval

  • FLANN – Library for performing fast approximate nearest neighbor.
  • Kernelized LSH – Source code for Kernelized Locality-Sensitive Hashing (ICCV 2009).
  • ITQ Binary codes – Code for generation of small binary codes using Iterative Quantization and other baselines such as Locality-Sensitive-Hashing (CVPR 2011).
  • INRIA Image Retrieval – Efficient code for state-of-the-art large-scale image retrieval (CVPR 2011).

Object Detection

3D Recognition

Action Recognition





  • Animals with Attributes – 30,475 images of 50 animal classes with 6 pre-extracted feature representations for each image.
  • aYahoo and aPascal – Attribute annotations for images collected from Yahoo and Pascal VOC 2008.
  • FaceTracer – 15,000 faces annotated with 10 attributes and fiducial points.
  • PubFig – 58,797 face images of 200 people with 73 attribute classifier outputs.
  • LFW – 13,233 face images of 5,749 people with 73 attribute classifier outputs.
  • Human Attributes – 8,000 people with annotated attributes. Check also this link for another dataset of human attributes.
  • SUN Attribute Database – Large-scale scene attribute database with a taxonomy of 102 attributes.
  • ImageNet Attributes – Variety of attribute labels for the ImageNet dataset.
  • Relative attributes – Data for OSR and a subset of PubFig datasets. Check also this link for the WhittleSearch data.
  • Attribute Discovery Dataset – Images of shopping categories associated with textual descriptions.

Fine-grained Visual Categorization

Face Detection

  • FDDB – UMass face detection dataset and benchmark (5,000+ faces)
  • CMU/MIT – Classical face detection dataset.

Face Recognition

  • Face Recognition Homepage – Large collection of face recognition datasets.
  • LFW – UMass unconstrained face recognition dataset (13,000+ face images).
  • NIST Face Homepage – includes face recognition grand challenge (FRGC), vendor tests (FRVT) and others.
  • CMU Multi-PIE – contains more than 750,000 images of 337 people, with 15 different views and 19 lighting conditions.
  • FERET – Classical face recognition dataset.
  • Deng Cai’s face dataset in Matlab Format – Easy to use if you want to play with simple face datasets including Yale, ORL, PIE, and Extended Yale B.
  • SCFace – Low-resolution face dataset captured from surveillance cameras.

Handwritten Digits

  • MNIST – large dataset containing a training set of 60,000 examples, and a test set of 10,000 examples.

Pedestrian Detection

Generic Object Recognition

  • ImageNet – Currently the largest visual recognition dataset in terms of number of categories and images.
  • Tiny Images – 80 million 32×32 low resolution images.
  • Pascal VOC – One of the most influential visual recognition datasets.
  • Caltech 101 / Caltech 256 – Popular image datasets containing 101 and 256 object categories, respectively.
  • MIT LabelMe – Online annotation tool for building computer vision databases.

Scene Recognition

Feature Detection and Description

  • VGG Affine Dataset – Widely used dataset for measuring performance of feature detection and description. Check VLBenchmarks for an evaluation framework.

Action Recognition

RGBD Recognition


Posted in Computer Vision, OpenCV

OpenCV Viola & Jones object detection in MATLAB

Posted by Hemprasad Y. Badgujar on January 19, 2015

In image processing, one of the most successful object detectors devised is the Viola and Jones detector, proposed in their seminal CVPR paper in 2001. A popular implementation used by image processing researchers and implementers is provided by the OpenCV library. In this post, I’ll show you how to run the OpenCV object detector in MATLAB for Windows. You should have some familiarity with OpenCV and with the Viola and Jones detector to work through this tutorial.

Steps in the object detector

MATLAB is able to call functions in shared libraries. This means that, using the compiled OpenCV DLLs, we are able to directly call various OpenCV functions from within MATLAB. The flow of our MATLAB program, including the required OpenCV external function calls (based on this example), will go something like this:

  1. cvLoadHaarClassifierCascade: Load object detector cascade
  2. cvCreateMemStorage: Allocate memory for detector
  3. cvLoadImage: Load image from disk
  4. cvHaarDetectObjects: Perform object detection
  5. For each detected object:
    1. cvGetSeqElem: Get next detected object of type cvRect
    2. Display this detection result in MATLAB
  6. cvReleaseImage: Unload the image from memory
  7. cvReleaseMemStorage: De-allocate memory for detector
  8. cvReleaseHaarClassifierCascade: Unload the cascade from memory

Loading shared libraries

The first step is to load the OpenCV shared libraries using MATLAB’s loadlibrary() function. To use the functions listed in the object detector steps above, we need to load the OpenCV libraries cxcore2410.dll, cv2410.dll and highgui2410.dll. Assuming that OpenCV has been installed to "C:\Program Files\OpenCV", the libraries are loaded like this:

opencvPath = 'C:\Program Files\OpenCV';
includePath = fullfile(opencvPath, 'cxcore\include');
loadlibrary(...
    fullfile(opencvPath, 'bin\cxcore2410.dll'), ...
    fullfile(opencvPath, 'cxcore\include\cxcore.h'), ...
        'alias', 'cxcore2410', 'includepath', includePath);
loadlibrary(...
    fullfile(opencvPath, 'bin\cv2410.dll'), ...
    fullfile(opencvPath, 'cv\include\cv.h'), ...
        'alias', 'cv2410', 'includepath', includePath);
loadlibrary(...
    fullfile(opencvPath, 'bin\highgui2410.dll'), ...
    fullfile(opencvPath, 'otherlibs\highgui\highgui.h'), ...
        'alias', 'highgui2410', 'includepath', includePath);

You will get some warnings; these can be ignored for our purposes. You can display the list of functions that a particular shared library exports with the libfunctions() command in MATLAB. For example, to list the functions exported by the highgui library:

>> libfunctions('highgui2410')
Functions in library highgui2410:
cvConvertImage             cvQueryFrame
cvCreateCameraCapture      cvReleaseCapture
cvCreateFileCapture        cvReleaseVideoWriter
cvCreateTrackbar           cvResizeWindow
cvCreateVideoWriter        cvRetrieveFrame
cvDestroyAllWindows        cvSaveImage
cvDestroyWindow            cvSetCaptureProperty
cvGetCaptureProperty       cvSetMouseCallback
cvGetTrackbarPos           cvSetPostprocessFuncWin32
cvGetWindowHandle          cvSetPreprocessFuncWin32
cvGetWindowName            cvSetTrackbarPos
cvGrabFrame                cvShowImage
cvInitSystem               cvStartWindowThread
cvLoadImage                cvWaitKey
cvLoadImageM               cvWriteFrame

The first step in our object detector is to load a detector cascade. We are going to load one of the frontal face detector cascades that is provided with a normal OpenCV installation:

classifierFilename = 'C:/Program Files/OpenCV/data/haarcascades/haarcascade_frontalface_alt.xml';
cvCascade = calllib('cv2410', 'cvLoadHaarClassifierCascade', classifierFilename, ...

The function calllib() returns a libpointer structure containing two fairly self-explanatory fields, DataType and Value. To display the return value from cvLoadHaarClassifierCascade(), we can run:

>> cvCascade.Value
ans =
               flags: 1.1125e+009
               count: 22
    orig_window_size: [1x1 struct]
    real_window_size: [1x1 struct]
               scale: 0
    stage_classifier: [1x1 struct]
         hid_cascade: []

The above output shows that MATLAB has successfully loaded the cascade file and returned a pointer to an OpenCV CvHaarClassifierCascade object.

Prototype M-files

We could now continue implementing all of our OpenCV function calls from the object detector steps like this; however, we will run into a problem when cvGetSeqElem is called. To see why, try this:

libfunctions('cxcore2410', '-full')

The -full option lists the signatures for each imported function. The signature for the function cvGetSeqElem() is listed as:

[cstring, CvSeqPtr] cvGetSeqElem(CvSeqPtr, int32)

This shows that the return value for the imported cvGetSeqElem() function will be a pointer to a character (cstring). This is based on the function declaration in the cxcore.h header file:

CVAPI(char*)  cvGetSeqElem( const CvSeq* seq, int index );

However, in step 5.1 of our object detector steps, we require a CvRect object. Normally in C++ you would simply cast the character pointer return value to a CvRect object, but MATLAB does not support casting of return values from calllib(), so there is no way we can cast this to a CvRect.
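To make the cast concrete, here is a minimal sketch of what a C or C++ caller does with that char pointer, written with Python's ctypes rather than MATLAB. The CvRect layout (four C ints: x, y, width, height) is an assumption based on the fields used later in this post, and the rectangle values are made up for illustration.

```python
import ctypes


class CvRect(ctypes.Structure):
    # Assumed layout: four C ints, matching the x/y/width/height
    # fields that the detector code in this post reads.
    _fields_ = [
        ("x", ctypes.c_int),
        ("y", ctypes.c_int),
        ("width", ctypes.c_int),
        ("height", ctypes.c_int),
    ]


# Stand-in for one sequence element in memory (made-up values).
element = CvRect(48, 32, 120, 120)

# cvGetSeqElem is declared to return char*, so the caller first
# sees an untyped byte pointer to the element.
char_ptr = ctypes.cast(ctypes.pointer(element), ctypes.POINTER(ctypes.c_char))

# The cast: reinterpret the same address as a CvRect pointer.
rect = ctypes.cast(char_ptr, ctypes.POINTER(CvRect)).contents

print(rect.x, rect.y, rect.width, rect.height)
```

No data is copied here: the cast only changes how the bytes at that address are interpreted, which is exactly the step that calllib() does not offer.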

The solution is what is referred to as a prototype M-file. By constructing a prototype M-file, we can define our own signatures for the imported functions rather than using the declarations from the C++ header file.

Let’s generate the prototype M-file now:

loadlibrary(...
    fullfile(opencvPath, 'bin\cxcore2410.dll'), ...
    fullfile(opencvPath, 'cxcore\include\cxcore.h'), ...
        'mfilename', 'proto_cxcore');

This will automatically generate a prototype M-file named proto_cxcore.m based on the C++ header file. Open this file up and find the function signature for cvGetSeqElem and replace it with the following:

% char * cvGetSeqElem ( const CvSeq * seq , int index );
fcns.name{fcnNum}='cvGetSeqElem'; fcns.calltype{fcnNum}='cdecl'; fcns.LHS{fcnNum}='CvRectPtr'; fcns.RHS{fcnNum}={'CvSeqPtr', 'int32'};fcnNum=fcnNum+1;

This changes the return type for cvGetSeqElem() from a char pointer to a CvRect pointer.
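Other foreign-function interfaces let you do the same thing: declare your own signature for an imported function instead of accepting the one from the header. As an analogy only, the sketch below uses Python's ctypes (not MATLAB) and assumes CPython. PyBytes_AsString, a C function declared as returning char*, merely stands in for cvGetSeqElem; the CvRect layout and the rectangle values are assumptions made for illustration.

```python
import ctypes


class CvRect(ctypes.Structure):
    # Assumed layout: four C ints (x, y, width, height).
    _fields_ = [
        ("x", ctypes.c_int),
        ("y", ctypes.c_int),
        ("width", ctypes.c_int),
        ("height", ctypes.c_int),
    ]


# Raw bytes that happen to hold one rectangle (made-up values).
raw = bytes(CvRect(10, 20, 30, 40))

# Stand-in for cvGetSeqElem: a C function declared as returning char*.
get_elem = ctypes.pythonapi.PyBytes_AsString
get_elem.argtypes = [ctypes.py_object]

# Untyped declaration: the call only yields a plain address.
get_elem.restype = ctypes.c_void_p
address = get_elem(raw)

# Our own declaration, like the edited prototype entry:
# the same call now returns a pointer to CvRect.
get_elem.restype = ctypes.POINTER(CvRect)
rect = get_elem(raw).contents

print(rect.x, rect.y, rect.width, rect.height)
```

The function itself is untouched in both cases; only the caller's description of its return type changes, which is what the edited fcns.LHS entry does for MATLAB.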

We can now load the library using the new prototype:

loadlibrary(...
    fullfile(opencvPath, 'bin\cxcore2410.dll'), @proto_cxcore);

An example face detector

We now have all the pieces ready to write a complete object detector. The code listing below implements the object detector steps listed above to perform face detection on an image. Additionally, the image is displayed in MATLAB and a box is drawn around any detected faces.

opencvPath = 'C:\Program Files\OpenCV';
includePath = fullfile(opencvPath, 'cxcore\include');
inputImage = 'lenna.jpg';
%% Load the required libraries
if libisloaded('highgui2410'), unloadlibrary highgui2410, end
if libisloaded('cv2410'), unloadlibrary cv2410, end
if libisloaded('cxcore2410'), unloadlibrary cxcore2410, end
loadlibrary(...
    fullfile(opencvPath, 'bin\cxcore2410.dll'), @proto_cxcore);
loadlibrary(...
    fullfile(opencvPath, 'bin\cv2410.dll'), ...
    fullfile(opencvPath, 'cv\include\cv.h'), ...
        'alias', 'cv2410', 'includepath', includePath);
loadlibrary(...
    fullfile(opencvPath, 'bin\highgui2410.dll'), ...
    fullfile(opencvPath, 'otherlibs\highgui\highgui.h'), ...
        'alias', 'highgui2410', 'includepath', includePath);
%% Load the cascade
classifierFilename = 'C:/Program Files/OpenCV/data/haarcascades/haarcascade_frontalface_alt.xml';
cvCascade = calllib('cv2410', 'cvLoadHaarClassifierCascade', classifierFilename, ...
%% Create memory storage
cvStorage = calllib('cxcore2410', 'cvCreateMemStorage', 0);
%% Load the input image
cvImage = calllib('highgui2410', ...
    'cvLoadImage', inputImage, int16(1));
if ~cvImage.Value.nSize
    error('Image could not be loaded');
end
%% Perform object detection
cvSeq = calllib('cv2410', ...
    'cvHaarDetectObjects', cvImage, cvCascade, cvStorage, 1.1, 2, 0, ...
%% Loop through the detections and display bounding boxes
imshow(imread(inputImage)); %load and display image in MATLAB
for n =
    cvRect = calllib('cxcore2410', ...
        'cvGetSeqElem', cvSeq, int16(n));
    rectangle('Position', ...
        [cvRect.Value.x cvRect.Value.y ...
        cvRect.Value.width cvRect.Value.height], ...
        'EdgeColor', 'r', 'LineWidth', 3);
end
%% Release resources
calllib('cxcore2410', 'cvReleaseImage', cvImage);
calllib('cxcore2410', 'cvReleaseMemStorage', cvStorage);
calllib('cv2410', 'cvReleaseHaarClassifierCascade', cvCascade);

As an example, the following is the output after running the detector above on a greyscale version of the Lenna test image:

Note: If you get a segmentation fault attempting to run the code above, try evaluating the cells one-by-one (e.g. by pressing Ctrl-Enter) – it seems to fix the problem.

Posted in Computer Vision, OpenCV, OpenCV Tutorial

Computer Vision Algorithm Implementations

Posted by Hemprasad Y. Badgujar on May 6, 2014

Participate in Reproducible Research

General Image Processing

(C/C++ code, BSD lic) Image manipulation, matrix manipulation, transforms
(C/C++ code, BSD lic) Basic image processing, matrix manipulation and feature extraction algorithms: rotation, flip, photometric normalisations (Histogram Equalization, Multiscale Retinex, Self-Quotient Image or Gross-Brajovic), edge detection, 2D DCT, 2D FFT, 2D Gabor, PCA to do Eigen-Faces, LDA to do Fisher-Faces. Various metrics (Euclidean, Mahalanobis, ChiSquare, NormalizeCorrelation, TangentDistance, …)
(C/C++ code, MIT lic) A Free Experimental System for Image Processing (loading, transforms, filters, histogram, morphology, …)
(C/C++ code, GPL and LGPL lic) CImg Library is an open source C++ toolkit for image processing
Generic Image Library (GIL) boost integration
(C/C++ code, MIT lic) Adobe open source C++ Generic Image Library (GIL)
SimpleCV a kinder, gentler machine vision library
(python code, MIT lic) SimpleCV is a Python interface to several powerful open source computer vision libraries in a single convenient package
PCL, The Point Cloud Library
(C/C++ code, BSD lic) The Point Cloud Library (or PCL) is a large scale, open project for point cloud processing. The PCL framework contains numerous state-of-the-art algorithms including filtering, feature estimation, surface reconstruction, registration, model fitting and segmentation.
Population, imaging library in C++ for processing, analysing, modelling and visualising
(C/C++ code, CeCILL lic) Population is an open-source imaging library in C++ for processing, analysing, modelling and visualising, including more than 200 algorithms designed by V. Tariel.
(C/C++ code, LGPL 3) A computer vision framework based on Qt and OpenCV that provides an easy to use interface to display, analyze and run computer vision algorithms. The library is provided with multiple application examples including stereo, SURF, Sobel and Hough transform.
Machine Vision Toolbox
(MATLAB/C, LGPL lic) image processing, segmentation, blob/line/point features, multiview geometry, camera models, colorimetry.
(Java code, Apache lic) BoofCV is an open source Java library for real-time computer vision and robotics applications. BoofCV is organized into several packages: image processing, features, geometric vision, calibration, visualize, and IO.
(C++ code, MIT lic) Simd is a free open-source library in C++. It includes high-performance image processing algorithms, optimized using SIMD CPU extensions such as SSE2, SSSE3, SSE4.2 and AVX2.
Free but not open source – ArrayFire (formerly LibJacket) is a matrix library for CUDA
(CUDA/C++, free lic) ArrayFire offers hundreds of general matrix and image processing functions, all running on the GPU. The syntax is very Matlab-like, with the goal of offering easy porting of Matlab code to C++/ArrayFire.

Image Acquisition, Decoding & encoding

(C/C++ code, LGPL or GPL lic) Record, convert and stream audio and video (many codecs)
(C/C++ code, BSD lic) PNG, JPEG,… images, avi video files, USB webcam,…
(C/C++ code, BSD lic) Video file decoding/encoding (ffmpeg integration), image capture from a frame grabber or from USB, Sony pan/tilt/zoom camera control using VISCA interface
lib VLC
(C/C++ code, GPL lic) Used by VLC player: record, convert and stream audio and video
(C/C++ code, LGPL lic) RTSP streams
(C/C++ code, GPL lic) Loading & saving DPX, EXR, GIF, JPEG, JPEG-2000, PDF, PhotoCD, PNG, Postscript, SVG, TIFF, and more
(C/C++ code, LGPL lic) Loading & saving various image formats
(C/C++ code, GPL & FPL lic) PNG, BMP, JPEG, TIFF loading
(C/C++ code, LGPL lic) VideoMan aims to simplify image capture from cameras, video files or image sequences.

Segmentation

(C/C++ code, BSD lic) Pyramid image segmentation
(C/C++ code, Microsoft Research Lic) Branch-and-Mincut Algorithm for Image Segmentation
Efficiently solving multi-label MRFs (Readme)
(C/C++ code) Segmentation, object category labelling, stereo

Machine Learning

(C/C++ code, BSD lic) Gradient machines (multi-layered perceptrons, radial basis functions, mixtures of experts, convolutional networks and even time-delay neural networks), Support vector machines, Ensemble models (bagging, adaboost), Non-parametric models (K-nearest-neighbors, Parzen regression and Parzen density estimator), distributions (Kmeans, Gaussian mixture models, hidden Markov models, input-output hidden Markov models, and Bayes classifier), speech recognition tools

Object Detection

(C/C++ code, BSD lic) Viola-jones face detection (Haar features)
(C/C++ code, BSD lic) MLP & cascade of Haar-like classifiers face detection
Hough Forests
(C/C++ code, Microsoft Research Lic) Class-Specific Hough Forests for Object Detection
Efficient Subwindow Object Detection
(C/C++ code, Apache Lic) Christoph Lampert “Efficient Subwindow” algorithms for Object Detection
INRIA Object Detection and Localization Toolkit
(C/C++ code, Custom Lic) Histograms of Oriented Gradients library for Object Detection

Object Category Labelling

Efficiently solving multi-label MRFs (Readme)
(C/C++ code) Segmentation, object category labelling, stereo
Multi-label optimization
(C/C++/MATLAB code) The gco-v3.0 library is for optimizing multi-label energies. It supports energies with any combination of unary, pairwise, and label cost terms.

Optical flow

(C/C++ code, BSD lic) Horn & Schunck algorithm, Lucas & Kanade algorithm, Lucas-Kanade optical flow in pyramids, block matching.
(C/C++/OpenGL/Cg code, LGPL) Gain-Adaptive KLT Tracking and TV-L1 optical flow on the GPU.
(C/C++/Matlab code, Custom Lic.) The RLOF library provides GPU and CPU implementations of optical flow and feature tracking methods.

Features Extraction & Matching

SIFT by R. Hess
(C/C++ code, GPL lic) SIFT feature extraction & RANSAC matching
(C/C++ code) SURF feature extraction algorithm (kind of fast SIFT)
(C/C++ code, Ecole Polytechnique and ENS Cachan for commercial Lic) Affine SIFT (ASIFT)
VLFeat (formerly Sift++)
(C/C++ code) SIFT, MSER, k-means, hierarchical k-means, agglomerative information bottleneck, and quick shift
A GPU Implementation of Scale Invariant Feature Transform (SIFT)
(C/C++ code, GPL lic) An enhanced version of RANSAC that considers the correlation between data points

Nearest Neighbors matching

(C/C++ code, BSD lic) Approximate Nearest Neighbors (Fast Approximate Nearest Neighbors with Automatic Algorithm Configuration)
(C/C++ code, LGPL lic) Approximate Nearest Neighbor Searching

Tracking

(C/C++ code, BSD lic) Kalman, Condensation, CAMSHIFT, Mean shift, Snakes
KLT: An Implementation of the Kanade-Lucas-Tomasi Feature Tracker
(C/C++ code, public domain) Kanade-Lucas-Tomasi Feature Tracker
(C/C++/OpenGL/Cg code) A GPU-based Implementation of the Kanade-Lucas-Tomasi Feature Tracker
(C/C++/OpenGL/Cg code, LGPL) Gain-Adaptive KLT Tracking and TV-L1 optical flow on the GPU
On-line boosting trackers
(C/C++, LGPL) On-line boosting tracker, semi-supervised tracker, beyond semi-supervised tracker
Single Camera background subtraction tracking
(C/C++, LGPL) Background subtraction based tracking algorithm using OpenCV.
Multi-camera tracking
(C/C++, LGPL) Multi-camera particle filter tracking algorithm using OpenCV and Intel IPP.

Simultaneous localization and mapping

Real-Time SLAM – SceneLib
(C/C++ code, LGPL lic) Real-time vision-based SLAM with a single camera
(C/C++ code, Isis Innovation Limited lic) Parallel Tracking and Mapping for Small AR Workspaces
(C/C++ code, BSD lic) GTSAM is a library of C++ classes that implement smoothing and mapping (SAM) in robotics and vision, using factor graphs and Bayes networks as the underlying computing paradigm rather than sparse matrices

Camera Calibration & constraint

(C/C++ code, BSD lic) Chessboard calibration, calibration with rig or pattern
Geometric camera constraint – Minimal Problems in Computer Vision
Minimal problems in computer vision arise when computing geometrical models from image data. They often lead to solving systems of algebraic equations.
Camera Calibration Toolbox for Matlab
(Matlab toolbox) Camera Calibration Toolbox for Matlab by Jean-Yves Bouguet (C implementation in OpenCV)

Multi-View Reconstruction

Bundle Adjustment – SBA
(C/C++ code, GPL lic) A Generic Sparse Bundle Adjustment Package Based on the Levenberg-Marquardt Algorithm
Bundle Adjustment – SSBA
(C/C++ code, LGPL lic) Simple Sparse Bundle Adjustment (SSBA)

Stereo

Efficiently solving multi-label MRFs (Readme)
(C/C++ code) Segmentation, object category labelling, stereo
LIBELAS: Library for Efficient LArge-scale Stereo Matching
(C/C++ code) Disparity maps, stereo

Structure from motion

(C/C++ code, GPL lic) A structure-from-motion system for unordered image collections
Patch-based Multi-view Stereo Software (Windows version)
(C/C++ code, GPL lic) A multi-view stereo software that takes a set of images and camera parameters, then reconstructs 3D structure of an object or a scene visible in the images
libmv – work in progress
(C/C++ code, MIT lic) A structure from motion library
Multicore Bundle Adjustment
(C/C++/GPU code, GPL3 lic) Design and implementation of new inexact Newton type Bundle Adjustment algorithms that exploit hardware parallelism for efficiently solving large scale 3D scene reconstruction problems.
(C/C++/GPU code, MPL2 lic) OpenMVG (“open Multiple View Geometry”) is a library for computer-vision scientists, targeted especially at the Multiple View Geometry community. It is designed to provide easy access to the classical problem solvers in Multiple View Geometry and to solve them accurately.

Visual odometry

LIBVISO2: Library for VISual Odometry 2
(C/C++ code, Matlab, GPL lic) Libviso 2 is a very fast cross-platform (Linux, Windows) C++ library with MATLAB wrappers for computing the 6 DOF motion of a moving mono/stereo camera.

Posted in Apps Development, C, Computer Hardware, Computer Network & Security, CUDA, Game Development, GPU (CUDA), GPU Accelareted, Graphics Cards, Image Processing, OpenCV, PARALLEL, Simulation, Virtualization
