Something More for Research

Explorer of Research #HEMBAD

GPU Parallel Programming in VS2012 with NVIDIA CUDA

Posted by Hemprasad Y. Badgujar on March 4, 2013


1. Introduction

Here I will share my first experience creating a CUDA-based C++ program on Windows using Visual Studio 2012. CUDA stands for Compute Unified Device Architecture, NVIDIA’s general-purpose computing API for its graphics hardware. The simple program is taken from NVIDIA’s example code: it fills two large arrays, copies them to the GPU, and runs a SAXPY operation (y = a*x + y) on them. Before continuing, install the required CUDA drivers, toolkit and SDK from here:
http://developer.nvidia.com/cuda/cuda-downloads

Or, if you would rather install the latest pre-production CUDA toolkit, head over here:
http://developer.nvidia.com/cuda/cuda-pre-production

You should also have a working C++ compiler. I am using Visual Studio 2012 on Windows 8 64-bit. Note that CUDA-based applications will not run unless a CUDA-capable NVIDIA GPU is present in your system.

2. Setting up Visual Studio 2012

In principle, the installer sets everything up automatically. However, with the current CUDA 5.0 release you might not be able to build your project, because nvcc.exe does not yet support the new cl.exe compiler version that ships with Visual Studio 2012. If you try to compile any sample from the SDK, you will get errors about missing targets and props files. To fix this, deploy those files manually according to the instructions in “C:\Program Files (x86)\NVIDIA GPU Computing Toolkit\CUDA\v5.0\extras\visual_studio_integration”

Those files still need some modifications for a successful compilation. You can download the modified files here: BuildCustomizations.rar. Extract the contents to the folder “C:\Program Files (x86)\MSBuild\Microsoft.Cpp\v4.0\V110\BuildCustomizations\”.
If you prefer to modify the files manually, follow these instructions carefully:

  1. Copy all the build customization files to a working folder
  2. Open “CUDA 5.0.props”. Search for the following lines:
    Code:
    <CudaClVersion Condition="'$(PlatformToolset)' == 'v90'">2008</CudaClVersion>
    <CudaClVersion Condition="'$(PlatformToolset)' == 'v100'">2010</CudaClVersion>

    and add this new line:

    Code:
    <CudaClVersion Condition="'$(PlatformToolset)' == 'v110'">2010</CudaClVersion>
  3. Open “CUDA 5.0.targets”. Search for the text “CudaCleanDependsOn” and replace the tag content with these lines:
    Code:
    <CudaCleanDependsOn>
      $(CudaCompileDependsOn);
      _SelectedFiles;
      CudaFilterSelectedFiles;
      AddCudaCompileMetadata;
      AddCudaLinkMetadata;
      AddCudaCompileDeps;
      AddCudaCompilePropsDeps;
      ValidateCudaBuild;
      ValidateCudaCodeGeneration;
      ComputeCudaCompileOutput;
      PrepareForCudaBuild
    </CudaCleanDependsOn>
  4. In the same file, search for “GenerateRelocatableDeviceCode”. Replace the line with the following:
    Code:
    GenerateRelocatableDeviceCode="%(CudaCompile.GenerateRelocatableDeviceCode)"
  5. Go down a bit and look for “CodeGeneration”. Replace the line with this:
    Code:
    CodeGeneration="%(CudaCompile.CodeGenerationValues)"
  6. Again search for “CommandLineTemplate”. It should be somewhere near the end of the file. Replace the line with this:
    Code:
    CommandLineTemplate="&quot;$(CudaToolkitNvccPath)&quot; %(CudaCompile.BuildCommandLineTemplate) %(CudaCompile.ApiCommandLineTemplate) %(CudaCompile.CleanCommandLineTemplate)" />
  7. Copy all modified files here: “C:\Program Files (x86)\MSBuild\Microsoft.Cpp\v4.0\V110\BuildCustomizations\”

Also, modify line 90 of the file “host_config.h” located in the folder:
“C:\Program Files (x86)\NVIDIA GPU Computing Toolkit\CUDA\v5.0\include\”
by changing the value ‘1600’ to ‘1700’ (1700 is the _MSC_VER value of the Visual Studio 2012 compiler).

Note: Remove the “(x86)” part from the toolkit paths if you use the 64-bit CUDA toolkit.
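To see why the host_config.h change matters, here is a minimal C++ sketch of a compiler version check of that kind. I am assuming the header rejects any compiler whose _MSC_VER falls outside a fixed range, and that the lower bound is 1400. The function names are hypothetical, for illustration only, and are not part of the CUDA toolkit.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper: maps an _MSC_VER value to the Visual Studio release
// that defines it (only the releases relevant to this post).
std::string visualStudioRelease(int mscVer)
{
    switch (mscVer) {
        case 1400: return "Visual Studio 2005";
        case 1500: return "Visual Studio 2008";
        case 1600: return "Visual Studio 2010";
        case 1700: return "Visual Studio 2012";
        default:   return "unknown";
    }
}

// Assumed shape of the guard in host_config.h: the compiler is accepted
// only when its _MSC_VER lies inside [minVer, maxVer].
bool hostCompilerAccepted(int mscVer, int minVer, int maxVer)
{
    assert(minVer <= maxVer);
    return mscVer >= minVer && mscVer <= maxVer;
}
```

With the original upper limit, hostCompilerAccepted(1700, 1400, 1600) is false, so a Visual Studio 2012 build is rejected. After raising the limit to 1700, the same compiler passes the check.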

Syntax Highlighting

To enable C++ syntax highlighting for CUDA source files, follow these steps:

  1. Select the menu “Tools->Options…”. Open “Text Editor” in the tree view on the left, and click on “File Extension”.
  2. Type “cu” in the “Extension” box, set the editor to “Microsoft Visual C++” and click “Add”. Click “OK” on the dialog box.
  3. Restart Visual Studio and your CUDA code should now have syntax highlighting.

3. Creating the App

Make sure you have installed all required SDKs. If everything is in order, create a simple console project and enter the following code in a “.cu” source file, since the kernel launch syntax is only understood by nvcc:

Code:
#include <cmath>
#include <cstdlib>
#include <iostream>
using namespace std;

// Each thread computes one element of y = a*x + y.
__global__ void saxpy(int n, float a, float *x, float *y)
{
  int i = blockIdx.x*blockDim.x + threadIdx.x;
  if (i < n) y[i] = a*x[i] + y[i];  // guard: the last block may overshoot n
}

int main(void)
{
  int N = 1<<20;
  float *x, *y, *d_x, *d_y;
  // Host buffers
  x = (float*)malloc(N*sizeof(float));
  y = (float*)malloc(N*sizeof(float));
  // Device buffers
  cudaMalloc(&d_x, N*sizeof(float));
  cudaMalloc(&d_y, N*sizeof(float));
  for (int i = 0; i < N; i++) {
    x[i] = 1.0f;
    y[i] = 2.0f;
  }
  cudaMemcpy(d_x, x, N*sizeof(float), cudaMemcpyHostToDevice);
  cudaMemcpy(d_y, y, N*sizeof(float), cudaMemcpyHostToDevice);
  // Perform SAXPY on 1M elements
  saxpy<<<(N+255)/256, 256>>>(N, 2.0f, d_x, d_y);
  // Copy the result back; this copy waits for the kernel to finish
  cudaMemcpy(y, d_y, N*sizeof(float), cudaMemcpyDeviceToHost);
  // Every element should equal 2.0f*1.0f + 2.0f = 4.0f
  float maxError = 0.0f;
  for (int i = 0; i < N; i++)
    maxError = max(maxError, fabs(y[i]-4.0f));
  cout << "Max error: " << maxError << endl;
  // Release device and host memory
  cudaFree(d_x);
  cudaFree(d_y);
  free(x);
  free(y);
  return 0;
}
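
The launch configuration (N+255)/256 and the index expression blockIdx.x*blockDim.x + threadIdx.x can be checked on the CPU without any GPU. The following is a rough host-only C++ sketch. The names blocksFor and emulateLaunch are my own, not part of CUDA, and the loops only imitate the order-independent work of the real threads.

```cpp
#include <cassert>
#include <vector>

// Number of blocks needed so that blocks * threadsPerBlock >= n.
// This is integer ceiling division, the same arithmetic as (N+255)/256.
int blocksFor(int n, int threadsPerBlock)
{
    assert(n >= 0 && threadsPerBlock > 0);
    return (n + threadsPerBlock - 1) / threadsPerBlock;
}

// CPU emulation of the launch: visits every (block, thread) pair, computes
// the global index the kernel would compute, and applies the bounds guard.
// Returns how many times each of the n elements was touched.
std::vector<int> emulateLaunch(int n, int threadsPerBlock)
{
    std::vector<int> hits(n, 0);
    const int blocks = blocksFor(n, threadsPerBlock);
    for (int block = 0; block < blocks; ++block) {
        for (int thread = 0; thread < threadsPerBlock; ++thread) {
            const int i = block * threadsPerBlock + thread;
            if (i < n) {   // same guard as in the kernel
                ++hits[i];
            }
        }
    }
    return hits;
}
```

For N = 1<<20 and 256 threads per block, blocksFor returns 4096. With a size that is not a multiple of 256, such as 1000, four blocks are launched and the guard keeps the 24 surplus threads of the last block from writing past the end of the array.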

Before compiling, link against the CUDA runtime library by specifying its location and name in the project’s properties page:

  1. Navigate to the “Configuration Properties\Linker\General” option
  2. In the “Additional Library Directories” field, add “$(CUDA_PATH)lib\$(PlatformName)”
  3. Go to the “Configuration Properties\Linker\Input” option
  4. Finally, in the “Additional Dependencies” field, add “cudart.lib”

The code should compile successfully.
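
Once the program builds, its result can be compared against a plain CPU version of the same computation. This is a minimal C++ sketch under my own naming (saxpyReference, maxAbsError and referenceMaxError are hypothetical helpers, not taken from NVIDIA’s sample). It uses the same inputs as the program above: x filled with 1.0f, y filled with 2.0f, and a = 2.0f.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// CPU reference for SAXPY: y[i] = a * x[i] + y[i].
void saxpyReference(float a, const std::vector<float>& x, std::vector<float>& y)
{
    assert(x.size() == y.size());
    for (std::size_t i = 0; i < x.size(); ++i) {
        y[i] = a * x[i] + y[i];
    }
}

// Largest absolute deviation of any element from the expected value.
float maxAbsError(const std::vector<float>& values, float expected)
{
    float maxError = 0.0f;
    for (std::size_t i = 0; i < values.size(); ++i) {
        maxError = std::fmax(maxError, std::fabs(values[i] - expected));
    }
    return maxError;
}

// Runs the reference on n elements with the inputs used in this post
// and returns the maximum error against the expected value 4.0f.
float referenceMaxError(int n)
{
    std::vector<float> x(n, 1.0f);
    std::vector<float> y(n, 2.0f);
    saxpyReference(2.0f, x, y);
    return maxAbsError(y, 4.0f);
}
```

Since 2.0f*1.0f + 2.0f is exactly representable in single precision, the reference error is zero, and a correct GPU run should report the same maximum error.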

Read more: http://blog.norture.com/2012/10/gpu-parallel-programming-in-vs2012-with-nvidia-cuda/#ixzz2MVFioDQt
