Something More for Research

Explorer of Research #HEMBAD

OpenCV on CUDA Performance Comparison

Posted by Hemprasad Y. Badgujar on July 17, 2015


Performance Table

For work, I had to test a number of different GPUs with some of OpenCV’s CUDA-accelerated C++ functions; the GPUs will end up in rack servers. Click the figure to open the full performance table, which is hosted on Google Docs.

Models tested

| Computing Model | CUDA V. | Cores | Frequency [MHz] | Speedup (avg.) |
|---|---|---|---|---|
| Intel i7-2600K | ~ | 4 | 3400 | 1x |
| Intel Xeon E5620 | ~ | 4 | 2400 | 0.68x |
| NVIDIA GTX 560 ASUS | 2.1 | 336 | 810 | 22.4x |
| NVIDIA GTX 570 EVGA | 2.1 | 480 | 810 | 31.9x |
| NVIDIA GTX 670 EVGA | 3.0 | 1344 | 950 | 34.96x |
| NVIDIA GTX 680 EVGA | 3.0 | 1536 | 1058 | 34.90x |

*I used Ubuntu 12.04, CUDA 4.2, OpenCV 2.4 C++ (latest SVN snapshot), and the NVIDIA 295.51 driver.
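The "Speedup (avg.)" column can be read as the baseline CPU time divided by the device time, averaged over the tested functions. Below is a minimal sketch of that arithmetic. It assumes an arithmetic mean of the per-function ratios, which the post does not confirm, and `speedup` and `meanSpeedup` are hypothetical helper names, not OpenCV functions.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Speedup of a device relative to the baseline: baseline time / device time.
// Both times must be in the same unit (for example milliseconds).
double speedup(double baselineMs, double deviceMs) {
    assert(baselineMs > 0.0 && deviceMs > 0.0);
    return baselineMs / deviceMs;
}

// Arithmetic mean of the per-function speedups.
// The averaging method is an assumption; the post only says "avg.".
double meanSpeedup(const std::vector<double>& baselineMs,
                   const std::vector<double>& deviceMs) {
    assert(!baselineMs.empty() && baselineMs.size() == deviceMs.size());
    double sum = 0.0;
    for (std::size_t i = 0; i < baselineMs.size(); ++i) {
        sum += speedup(baselineMs[i], deviceMs[i]);
    }
    return sum / static_cast<double>(baselineMs.size());
}
```

With made-up timings, `meanSpeedup({100.0, 40.0}, {5.0, 4.0})` averages a 20x and a 10x result to 15x.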

Functions tested

matchTemplate, minMaxLoc, remap, dft, cornerHarris, integral, norm, meanShift, BruteForceMatcher, magnitude, add, log, mulSpectrums, resize, cvtColor, erode, threshold, pow, projectPoints, solvePnPRansac, GaussianBlur, filter2D, pyrDown, pyrUp, equalizeHist, reduce.

Graphs Per Function

Test Setup

Conclusion

In terms of value for money, the GTX 670 (€400) with 2 GB of RAM is very nice. There is absolutely no reason to buy the GTX 680, since it costs €100 more for practically the same average speedup (34.90x versus 34.96x). Then again, the GTX 570 costs €300, which is nice, but it only has 1.25 GB of RAM, which can be dangerous when working with large images (nasty errors).
It is clear that GPU computation is BLOODY fast. But I HAVE to note that only a SINGLE core of each CPU was used for the normal CPU functions. These algorithms have not really been optimized for multithreading, if I’m not mistaken. On the other hand, a speed increase of more than 20x is too much for any Intel CPU to catch up with. GPU computing is a must if fast image processing is important.
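To put a rough number on the value-for-money comparison, the average speedups from the table can be divided by the prices quoted above. This is only a sketch: `speedupPer100Euro` is a hypothetical helper, and the metric ignores memory size, which matters for large images.

```cpp
#include <cassert>

// Average speedup bought per 100 euro: a crude value-for-money figure.
// It deliberately ignores everything except speedup and price.
double speedupPer100Euro(double avgSpeedup, double priceEuro) {
    assert(priceEuro > 0.0);
    return avgSpeedup / priceEuro * 100.0;
}
```

With the figures from this post, the GTX 570 (31.9x, €300) comes to about 10.6, the GTX 670 (34.96x, €400) to about 8.7, and the GTX 680 (34.90x, €400 plus the €100 premium) to about 7.0. On this metric alone the GTX 570 leads, so the case for the GTX 670 rests on its extra memory.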
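A back-of-the-envelope check on the single-core caveat: even if the CPU code scaled perfectly across all four cores of the baseline CPU, a sizeable GPU advantage would remain. The sketch below assumes ideal linear scaling, which real multithreaded code rarely reaches, and `residualSpeedup` is a hypothetical helper name.

```cpp
#include <cassert>

// GPU advantage left over if the CPU code scaled perfectly across all cores.
// Perfect scaling is an optimistic bound for the CPU, not a measurement.
double residualSpeedup(double singleCoreSpeedup, int cpuCores) {
    assert(cpuCores > 0);
    return singleCoreSpeedup / cpuCores;
}
```

For the slowest card in the table, `residualSpeedup(22.4, 4)` still leaves about 5.6x; for the GTX 670, `residualSpeedup(34.96, 4)` leaves about 8.7x.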

GPU + GPU = Multi GPU

Multi GPU? Yes! Using 2x GTX 670s, you can use 2688 CORES. That means that if you don’t keep your GPUs on a leash, they might become self-aware. You have been warned.
Oh yes, MULTI GPU! OpenCV only natively supports one GPU per function, but of course you can use more if you want. OpenCV themselves suggest Intel’s TBB (Threading Building Blocks) for some reason. OpenCV once started with OpenMP (Open Multi-Processing, an API for parallel/multithreaded programming), but does not support that any more. Luckily, if you know your way around OpenMP, it is quite easy to implement.
There are some functions you can use to drive more than one GPU from OpenCV. I tend to use OpenMP: make a simple parallel loop with some conditions, and within each thread call the “gpu::setDevice” C++ function to set which device that thread uses. For example, when you have two GPUs, it is a good idea to let OpenMP set “num_threads(2)”, so each GPU gets its own thread, and then call ‘gpu::setDevice(omp_get_thread_num())’ in each thread. I got a speed increase of 40~80% using two GPUs; see the nice setup I had in my desktop where I tried it. The cards will eventually end up in the rack server, purely for GPU computation, for which they are ideal.

Code for Multi-GPU in OpenCV with OpenMP

    #include <iostream>
    #include <omp.h>
    #include <opencv2/gpu/gpu.hpp>  // OpenCV 2.4 GPU module

    using namespace std;
    using namespace cv;

    int main() {
        bool useMGPU = true;  // assign one GPU per OpenMP thread
        bool useMP = true;    // run the loop in parallel
        int numGPUs = gpu::getCudaEnabledDeviceCount();
        omp_set_nested(1);  // Turn on nested MP (to use parallel loops in your loop)
        #pragma omp parallel if (useMP) num_threads(2)
        {
            #pragma omp for
            for (int i = 0; i < 10; i++) {
                // If multiple GPU support is on, assign the device based on the thread number
                int threadID = omp_get_thread_num();
                if (useMGPU && numGPUs > 1) {
                    cout << "Setting GPU#" << threadID << " for i#" << i << endl;
                    gpu::setDevice(threadID);
                }
                // Your GPU code here. The device has been set.
                // ..

                // Test to see if the GPU has been properly set throughout the loop (device should be == threadID)
                if (useMGPU) {
                    cout << " Had set GPU#" << gpu::getDevice() << " with tID#" << threadID << " (i#" << i << ")" << endl;
                }
            }
        }
        return 0;
    }
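
The listing above uses two threads and passes the thread number straight to gpu::setDevice, which works as long as there are at least as many GPUs as threads. If the thread count is ever raised beyond the number of GPUs, the thread number is no longer a valid device index. One defensive option is to wrap it around, as sketched below; `deviceForThread` is a hypothetical helper, not part of OpenCV or OpenMP.

```cpp
#include <cassert>

// Map an OpenMP thread number to a valid CUDA device index by wrapping around.
// With more threads than GPUs, several threads end up sharing one device.
int deviceForThread(int threadID, int numGPUs) {
    assert(threadID >= 0 && numGPUs > 0);
    return threadID % numGPUs;
}
```

Inside the loop, the call would then become `gpu::setDevice(deviceForThread(omp_get_thread_num(), numGPUs));`. With two GPUs and four threads, threads 0 and 2 use device 0 while threads 1 and 3 use device 1.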