Something More for Research

Explorer of Research #HEMBAD

Archive for the ‘Multimedia’ Category

Databases for Multi-camera , Network Camera , E-Surveillace

Posted by Hemprasad Y. Badgujar on February 18, 2016

Multi-view, Multi-Class Dataset: pedestrians, cars and buses

This dataset consists of 23 minutes and 57 seconds of synchronized frames taken at 25fps from 6 different calibrated DV cameras.
One camera was placed about 2m high of the ground, two others where located on a first floor high, and the rest on a second floor to cover an area of 22m x 22m.
The sequence was recorded at the EPFL university campus where there is a road with a bus stop, parking slots for cars and a pedestrian crossing.


Ground truth images
Ground truth annotations


The dataset on this page has been used for our multiview object pose estimation algorithm described in the following paper:

G. Roig, X. Boix, H. Ben Shitrit and P. Fua Conditional Random Fields for Multi-Camera Object Detection, ICCV11.

Multi-camera pedestrians video

“EPFL” data set: Multi-camera Pedestrian Videos

people tracking
results, please cite one of the references below.

On this page you can download a few multi-camera sequences that we acquired for developing and testing our people detection and tracking framework. All of the sequences feature several synchronised video streams filming the same area under different angles. All cameras are located about 2 meters from the ground. All pedestrians on the sequences are members of our laboratory, so there is no privacy issue. For the Basketball sequence, we received consent from the team.

Laboratory sequences

These sequences were shot inside our laboratory by 4 cameras. Four (respectively six) people are sequentially entering the room and walking around for 2 1/2 minutes. The frame rate is 25 fps and the videos are encoded using MPEG-4 codec.

[Camera 0] [Camera 1] [Camera 2] [Camera 3]

Calibration file for the 4 people indoor sequence.

[Camera 0] [Camera 1] [Camera 2] [Camera 3]

Calibration file for the 6 people indoor sequence.

Campus sequences

These two sequences called campus were shot outside on our campus with 3 DV cameras. Up to four people are simultaneously walking in front of them. The white line on the screenshots shows the limits of the area that we defined to obtain our tracking results. The frame rate is 25 fps and the videos are encoded using Indeo 5 codec.

[Seq.1, cam. 0] [Seq.1, cam. 1] [Seq.1, cam. 2]
[Seq.2, cam. 0] [Seq.2, cam. 1] [Seq.2, cam. 2]

Calibration file for the two above outdoor scenes.

Terrace sequences

The sequences below, called terrace, were shot outside our building on a terrace. Up to 7 people evolve in front of 4 DV cameras, for around 3 1/2 minutes. The frame rate is 25 fps and the videos are encoded using Indeo 5 codec.

[Seq.1, cam. 0] [Seq.1, cam. 1] [Seq.1, cam. 2] [Seq.1, cam. 3]
[Seq.2, cam. 0] [Seq.2, cam. 1] [Seq.2, cam. 2] [Seq.1, cam. 3]

Calibration file for the terrace scene.

Passageway sequence

This sequence dubbed passageway was filmed in an underground passageway to a train station. It was acquired with 4 DV cameras at 25 fps, and is encoded with Indeo 5. It is a rather difficult sequence due to the poor lighting.

[Seq.1, cam. 0] [Seq.1, cam. 1] [Seq.1, cam. 2] [Seq.1, cam. 3]

Calibration file for the passageway scene.

Basketball sequence

This sequence was filmed at a training session of a local basketball team. It was acquired with 4 DV cameras at 25 fps, and is encoded with Indeo 5.

[Seq.1, cam. 0] [Seq.1, cam. 1] [Seq.1, cam. 2] [Seq.1, cam. 3]

Calibration file for the basketball scene.

Camera calibration

POM only needs a simple calibration consisting of two homographies per camera view, which project the ground plane in top view to the ground plane in camera views and to the head plane in camera views (a plane parallel to the ground plane but located 1.75 m higher). Therefore, the calibration files given above consist of 2 homographies per camera. In degenerate cases where the camera is located inside the head plane, this one will project to a horizontal line in the camera image. When this happens, we do not provide a homography for the head plane, but instead we give the height of the line in which the head plane will project. This is expressed in percentage of the image height, starting from the top.

The homographies given in the calibration files project points in the camera views to their corresponding location on the top view of the ground plane, that is

H * X_image = X_topview .

We have also computed the camera calibration using the Tsai calibration toolkit for some of our sequences. We also make them available for download. They consist of an XML file per camera view, containing the standard Tsai calibration parameters. Note that the image size used for calibration might differ from the size of the video sequences. In this case, the image coordinates obtained with the calibration should be normalized to the size of the video.

Ground truth

We have created a ground truth data for some of the video sequences presented above, by locating and identifying the people in some frames at a regular interval.

To use these ground truth files, you must rely on the same calibration with the exact same parameters that we used when generating the data. We call top view the rectangular area of the ground plane in which we perform tracking.

This area is of dimensions tv_width x tv_height and has top left coordinate (tv_origin_x, tv_origin_y). Besides, we call grid our discretization of the top view area into grid_width x grid_height cells. An example is illustrated by the figure below, in which the grid has dimensions 5 x 4.

The people’s position in the ground truth are expressed in discrete grid coordinates. In order to be projected into the images with homographies or the Tsai calibration, these grid coordinates need to be translated into top view coordinates. We provide below a simple C function that performs this translation. This function takes the following parameters:

  • pos : the person position coming from the ground truth file
  • grid_width, grid_height : the grid dimension
  • tv_origin_x, tv_origin_y : the top left corner of the top view
  • tv_width, tv_height : the top view dimension
  • tv_x, tv_y : the top view coordinates, i.e. the output of the function
  void grid_to_tv(int pos, int grid_width, int grid_height,                  float tv_origin_x, float tv_origin_y, float tv_width,                  float tv_height, float &tv_x, float &tv_y) {     tv_x = ( (pos % grid_width) + 0.5 ) * (tv_width / grid_width) + tv_origin_x;    tv_y = ( (pos / grid_width) + 0.5 ) * (tv_height / grid_height) + tv_origin_y;  }

The table below summarizes the aforementionned parameters for the ground truth files we provide. Note that the ground truth for the terrace sequence has been generated with the Tsai calibration provided in the table. You will need to use this one to get a proper bounding box alignment.

Ground Truth Grid dimensions Top view origin Top view dimensions Calibration
6-people laboratory 56 x 56 (0 , 0) 358 x 360 file
terrace, seq. 1 30 x 44 (-500 , -1,500) 7,500 x 11,000 file (Tsai)
passageway, seq. 1 40 x 99 (0 , 38.48) 155 x 381 file

The format of the ground truth file is the following:

 1 <number of frames>  <number of people>  <grid width>  <grid height>  <step size>  <first frame>  <last frame> <pos> <pos> <pos> ... <pos> <pos> <pos> ... . . .

where <number of frames> is the total number of frames, <number of people> is the number of people for which we have produced a ground truth, <grid width> and <grid height>are the ground plane grid dimensions, <step size> is the frame interval between two ground truth labels (i.e. if set to 25, then there is a label once every 25 frames), and <first frame> and <last frame> are the first and last frames for which a label has been entered.

After the header, every line represents the positions of people at a given frame. <pos> is the position of a person in the grid. It is normally a integer >= 0, but can be -1 if undefined (i.e. no label has been produced for this frame) or -2 if the person is currently out of the grid.


Multiple Object Tracking using K-Shortest Paths Optimization

Jérôme Berclaz, François Fleuret, Engin Türetken, Pascal Fua
IEEE Transactions on Pattern Analysis and Machine Intelligence
pdf | show bibtex

Multi-Camera People Tracking with a Probabilistic Occupancy Map

François Fleuret, Jérôme Berclaz, Richard Lengagne, Pascal Fua
IEEE Transactions on Pattern Analysis and Machine Intelligence
pdf | show bibtex

MuHAVi: Multicamera Human Action Video Data

including selected action sequences with

MAS: Manually Annotated Silhouette Data

for the evaluation of human action recognition methods

Figure 1. The top view of the configuration of 8 cameras used to capture the actions in the blue action zone (which is marked with white tapes on the scene floor).

camera symbol

camera name

V1 Camera_1
V2 Camera_2
V3 Camera_3
V4 Camera_4
V5 Camera_5
V6 Camera_6
V7 Camera_7
V8 Camera_8

Table 1. Camera view names appearing in the MuHAVi data folders and the corresponding symbols used in Fig. 1.


On the table below, you can click on the links to download the data (JPG images) for the corresponding action

Important: We noted that some earlier versions of that earlier versions of MS Internet Explorer could not download files over 2GB size, so we recomment to use alternative browsers such as Firefox or Chrome.

Each tar file contains 7 folders corresponding to 7 actors (Person1 to Person7) each of which contains 8 folders corresponding to 8 cameras (Camera_1 to Camera_8). Image frames corresponding to every combination of action/actor/camera are named with image frame numbers starting from 00000001.jpg for simplicity. The video frame rate is 25 frames per second and the resolution of image frames (except for Camera_8) is 720 x 576 Pixels (columns x rows). The image resolution is 704 x 576 for Camera_8.

action class

action name

C1 WalkTurnBack 2.6GB
C2 RunStop 2.5GB
C3 Punch 3.0GB
C4 Kick 3.4GB
C5 ShotGunCollapse 4.3GB
C6 PullHeavyObject 4.5GB
C7 PickupThrowObject 3.0GB
C8 WalkFall 3.9GB
C9 LookInCar 4.6GB
C10 CrawlOnKnees 3.4GB
C11 WaveArms 2.2GB
C12 DrawGraffiti 2.7GB
C13 JumpOverFence 4.4GB
C14 DrunkWalk 4.0GB
C15 ClimbLadder 2.1GB
C16 SmashObject 3.3GB
C17 JumpOverGap 2.6GB

MIT Trajectory Data Set – Multiple Camera Views


MIT trajectory data set is for the research of activity analysis in multiple single camera view using the trajectories of objects as features. Object tracking is based on background subtraction using a Adaptive Gaussian Mixture model. There are totally four camera views. Trajectories in different camera views have been synchronized. The data can be downloaded from the following link,

MIT trajectory data set

Background image


Please cite as:

X. Wang, K. Tieu and E. Grimson, Correspondence‐Free Activity Analysis and Scene Modeling in Multiple Camera Views, IEEE Transactions on Pattern Analysis and Machine Intelligence(PAMI), Vol. 32, pp. 56-71, 2010..


MIT traffic data set is for research on activity analysis and crowded scenes. It includes a traffic video sequence of 90 minutes long. It is recorded by a stationary camera. The size of the scene is 720 by 480. It is divided into 20 clips and can be downloaded from the following links.

Ground Truth

In order to evaluate the performance of human detection on this data set, ground truth of pedestrians of some sampled frames are manually labeled. It can be downloaded below. A readme file provides the instructions of how to use it.
Ground truth of pedestrians


  1. Unsupervised Activity Perception in Crowded and Complicated scenes Using Hierarchical Bayesian Models
    X. Wang, X. Ma and E. Grimson
    IEEE Transactions on Pattern Analysis and Machine Intelligence (PAMI), Vol. 31, pp. 539-555, 2009
  2. Automatic Adaptation of a Generic Pedestrian Detector to a Specific Traffic Scene
    M. Wang and X. Wang
    IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR), 2011


This dataset is presented in our CVPR 2015 paper,
Linjie Yang, Ping Luo, Chen Change Loy, Xiaoou Tang. A Large-Scale Car Dataset for Fine-Grained Categorization and Verification, In Computer Vision and Pattern Recognition (CVPR), 2015. PDF

The Comprehensive Cars (CompCars) dataset contains data from two scenarios, including images from web-nature and surveillance-nature. The web-nature data contains 163 car makes with 1,716 car models. There are a total of 136,726 images capturing the entire cars and 27,618 images capturing the car parts. The full car images are labeled with bounding boxes and viewpoints. Each car model is labeled with five attributes, including maximum speed, displacement, number of doors, number of seats, and type of car. The surveillance-nature data contains 50,000 car images captured in the front view. Please refer to our paper for the details.

The dataset is well prepared for the following computer vision tasks:

  • Fine-grained classification
  • Attribute prediction
  • Car model verification

The train/test subsets of these tasks introduced in our paper are included in the dataset. Researchers are also welcome to utilize it for any other tasks such as image ranking, multi-task learning, and 3D reconstruction.


  1. You need to complete the release agreement form to download the dataset. Please see below.
  2. The CompCars database is available for non-commercial research purposes only.
  3. All images of the CompCars database are obtained from the Internet which are not property of MMLAB, The Chinese University of Hong Kong. The MMLAB is not responsible for the content nor the meaning of these images.
  4. You agree not to reproduce, duplicate, copy, sell, trade, resell or exploit for any commercial purposes, any portion of the images and any portion of derived data.
  5. You agree not to further copy, publish or distribute any portion of the CompCars database. Except, for internal use at a single site within the same organization it is allowed to make copies of the database.
  6. The MMLAB reserves the right to terminate your access to the database at any time.
  7. All submitted papers or any publicly available text using the CompCars database must cite the following paper:
    Linjie Yang, Ping Luo, Chen Change Loy, Xiaoou Tang. A Large-Scale Car Dataset for Fine-Grained Categorization and Verification, In Computer Vision and Pattern Recognition (CVPR), 2015.

Download instructions

Download the CompCars dataset Release Agreement, read it carefully, and complete it appropriately. Note that the agreement should be signed by a full-time staff member (that is, student is not acceptable). Then, please scan the signed agreement and send it to Mr. Linjie Yang (yl012(at) and cc to Chen Change Loy (ccloy(at) We will verify your request and contact you on how to download the database.

Stanford Cars Dataset


       The Cars dataset contains 16,185 images of 196 classes of cars. The data is split into 8,144 training images and 8,041 testing images, where each class has been split roughly in a 50-50 split. Classes are typically at the level of Make, Model, Year, e.g. 2012 Tesla Model S or 2012 BMW M3 coupe.


       Training images can be downloaded here.
Testing images can be downloaded here.
A devkit, including class labels for training images and bounding boxes for all images, can be downloaded here.
If you’re interested in the BMW-10 dataset, you can get that here.

Update: For ease of development, a tar of all images is available here and all bounding boxes and labels for both training and test are available here. If you were using the evaluation server before (which is still running), you can use test annotations here to evaluate yourself without using the server.


       An evaluation server has been set up here. Instructions for the submission format are included in the devkit. This dataset was featured as part of FGComp 2013, and competition results are directly comparable to results obtained from evaluating on images here.


       If you use this dataset, please cite the following paper:

3D Object Representations for Fine-Grained Categorization
Jonathan Krause, Michael Stark, Jia Deng, Li Fei-Fei
4th IEEE Workshop on 3D Representation and Recognition, at ICCV 2013 (3dRR-13). Sydney, Australia. Dec. 8, 2013.
[pdf]   [BibTex]   [slides]

Note that the dataset, as released, has 196 categories, one less than in the paper, as it has been cleaned up slightly since publication. Numbers should be more or less comparable, though.

The HDA dataset is a multi-camera high-resolution image sequence dataset for research on high-definition surveillance. 18 cameras (including VGA, HD and Full HD resolution) were recorded simultaneously during 30 minutes in a typical indoor office scenario at a busy hour (lunch time) involving more than 80 persons. In the current release (v1.1), 13 cameras have been fully labeled.


The venue spans three floors of the Institute for Systems and Robotics (ISR-Lisbon) facilities. The following pictures show the placement of the cameras. The 18 recorded cameras are identified with a small red circle. The 13 cameras with a coloured view field have been fully labeled in the current release (v1.1).


Each frame is labeled with the bounding boxes tightly adjusted to the visible body of the persons, the unique identification of each person, and flag bits indicating occlusion and crowd:

  • The bounding box is drawn so that it completely and tightly encloses the person.
  • If the person is occluded by something (except image boundaries), the bounding box is drawn by estimating the whole body extent.
  • People partially outside the image boundaries have their BB’s cropped to image limits. Partially occluded people and people partially outside the image boundaries are marked as ‘occluded’.
  • A unique ID is associated to each person, e.g., ‘person01’. In case of identity doubt, the special ID ‘personUnk’ is used.
  • Groups of people that are impossible to label individually are labelled collectively as ‘crowd’. People in front of a ’crowd’ area are labeled normally.

The following figures show examples of labeled frames: (a) an unoccluded person; (b) two occluded people; (c) a crowd with three people in front.


Data formats:

For each camera we provide the .jpg frames sequentially numbered and a .txt file containing the annotations according to the “video bounding box” (vbb) format defined in the Caltech Pedestrian Detection Database. Also on this site there are tools to visualise the annotations overlapped on the image frames.


Some statistics:

Labeled Sequences: 13

Number of Frames: 75207

Number of Bounding Boxes: 64028

Number of Persons: 85


Repository of Results:

We maintain a public repository of re-identification results in this dataset. Send us your CMC curve to be uploaded  (alex at isr ist utl pt).
Click here to see the full list and detailed experiments.

MANUAL_c_l_e_a_n cam60

Posted in Computer Network & Security, Computer Research, Computer Vision, Image Processing, Multimedia | Leave a Comment »

How to Build a GPU-Accelerated Cluster

Posted by Hemprasad Y. Badgujar on December 22, 2014

Some of the fastest computers in the world are cluster computers. A cluster is a computer system comprising two or more computers (“nodes”) connected with a high-speed network. Cluster computers can achieve higher availability, reliability, and scalability than is possible with an individual computer. With the increasing adoption of GPUs in high performance computing (HPC), NVIDIA GPUs are becoming part of some of the world’s most powerful supercomputers and clusters. The most recent top 500 list of the worlds fastest supercomputers included nearly 50 supercomputers powered by NVIDIA GPUs, and the current world’s fastest supercomputer, Oak Ridge National Labs TITAN, utilizes more than 18,000 NVIDIA Kepler GPUs.

In this post I will take you step by step through the process of designing, deploying, and managing a small research prototype GPU cluster for HPC. I will describe all the components needed for a GPU cluster as well as the complete cluster management software stack. The goal is to build a research prototype GPU cluster using all open source and free software and with minimal hardware cost.

I gave a talk on this topic at GTC 2013 (session S3516 – Building Your Own GPU Research Cluster Using Open Source Software Stack). The slides and a recording are available at that link so please check it out!

There are multiple motivating reason for building a GPU-based research cluster.

  • Get a feel for production systems and performance estimates;
  • Port your applications to GPUs and distributed computing (using CUDA-aware MPI);
  • Tune GPU and CPU load balancing for your application;
  • Use the cluster as development platform;
  • Early experience means increased readiness;
  • The investment is relatively small for a research prototype cluster

Figure 1 shows the steps to build a small GPU cluster. Let’s look at the process in more detail.

Steps in building GPU based Clusters
Figure 1: Seven steps to build and test a small research GPU cluster.

1. Choose Your Hardware

There are two steps to choosing the correct hardware.

  1. Node Hardware Details. This isthe specification of the machine (node) for your cluster. Each node has the  following components.
    • CPU processor from any vendor;
    • A motherboard with the following PCI-express connections:
      • 2x PCIe x16 Gen2/3 connections for Tesla GPUs;
      • 1x PCIe x8 wide for HCI Infiniband card;
    • 2 available network ports;
    • A minimum of 16-24 GB DDR3 RAM. (It is good to have more RAM in the system).
    • A power-supply unit (SMPS) with ample power rating. The total power supply needed includes power taken by the CPU, GPUs and other components in the system.
    • Secondary storage (HDD / SSD) based on your needs.

    GPU boards are wide enough to cover two physically adjacent PCI-e slots, so make sure that the PCIe x16 and x8 slots are physically separated on the motherboard so that you can fit a minimum of 2 PCI-e x16 GPUs and 1 PCIe x8 network card.

  2. Choose the right form factor forGPUs. Once you decide your machine specs you should also decide which modelGPUs you would like to consider for your system. The form factor ofGPUs is an important consideration. Kepler-based NVIDIA TeslaGPUs are available in two main form factors.
    • Tesla workstation products (C Series) are actively cooled GPU boards (this means they have a fan cooler over the GPU chip) that you can just plug in to your desktop computer in a PCI-e x16 slot. These use either two 6-pin or one 8-pin power supply connector.
    • Server products (M Series) are passively cooled GPUs (no fans) installed in standard servers sold by various OEMs.

    There are three different options for adding GPUs to your cluster:

    • you can buy C-series GPUs and install them in existing workstations or servers with enough space;
    • you can buy workstations from a vendor with C-series GPUs installed; or
    • you can buy servers with M-series GPUs installed.

2. Allocate Space, Power and Cooling

The goal for this step is to assess your physical infrastructure, including space, power and cooling needs, network considerations and storage requirements to ensure optimal system choices with room to grow your cluster in the future. You should make sure that you have enough space, power and cooling for your cluster. Clusters are mainly rack mounted, with multiple machines installed in a vertical rack. Vendors offer many server solutions that minimize the use of rack space.

3. Assembly and Physical Deployment

After deciding the machine configuration and real estate the next step is to physically deploy your cluster. Figure 2 shows the cluster deployment connections. The head node is the external interface to the cluster; it receives all external network connections, processes incoming requests, and assigns work to compute nodes (nodes with GPUs that perform the computation).

In a research prototype cluster you can also make use one of the compute nodes as a head node, but routing all traffic from the head node and also making it a compute node is not a good idea for production clusters because of performance and security issues. Production and large clusters mostly have a dedicated node to handle all incoming traffic while the head node just manages the work distribution for the compute nodes.

Head Node & Compute Nodes connections
Figure 2: Head node and compute node connections.

4. Head Node Installation

I recommend installing the head node with the open source Rocks Linux distribution. Rocks is a customizable, easy and quick way to install nodes. The Rocks installation package includes essential components for clusters, such as MPI. ROCKS head node installation is well-documented in the Rocks user guide, but here is a summary of the steps.

  • Follow the steps in Chapter 3 of the Rocks user guide and do a CD-based installation.
  • Install the NVIDIA drivers and CUDA Toolkit on the head node. (CUDA 5 provides a unified package that contain NVIDIA driver, toolkit and CUDA Samples.) 
  • Install network interconnect drivers (e.g. Infiniband) on the head node. These drivers are available from your interconnect manufacturer.
  • Nagios® Core™ is an open source system and network monitoring application. It watches hosts and services that you specify, alerting you when things go wrong and when they get better. To install, follow the instructions given in the Nagios installation guide.
  • The NRPE Nagios add-on allows you to execute Nagios plugins on remote Linux machines. This allows you to monitor local resources like CPU load and memory usage, which are not usually exposed to external machines, on remote machines using Nagios. Install NRPE following the install guide.

5. Compute Node Installation

After you have completed the head node installation, you will install the compute node software with the help of Rocks and the following steps.

  • On the head node: in a terminal shell run the command:
    > insert-ethers

    Choose “Compute Nodes” as the new node to add.

  • Power on the compute node with the Rocks CD as the first boot device or do a network installation.
  • The compute node will connect to the head node and start the installation.
  • Install the NRPE package as described in the NRPE guide.

6. Management and Monitoring

Once you finish the head node and all compute node installations, your cluster is ready to use! Before you actually start using it to run applications of interest, you should also set up management and monitoring tools on the cluster. These tools are necessary for proper management and monitoring of all resources available in cluster. In this section, I will describe various tools and software packages for GPU management and monitoring.


The NVIDIA System Management Interface (NVIDIA-SMI) is a tool distributed as part of the NVIDIA GPU driver. NVIDIA-SMI provides a variety of GPU system information including

  • thermal monitoring metrics: GPU temperature, chassis inlet/outlet temperatures;
  • system Information: firmware revision, configuration information;
  • system state: fan states, GPU faults, power system fault; ECC errors, etc.

NVIDIA-SMI allows you to configure the compute mode for any device in the system (Reference: CUDA C Programming Guide)

  • Default compute mode: multiple host threads can use the device at the same time.
  • Exclusive-process compute mode: Only one CUDA context may be created on the device across all processes in the system and that context may be current to as many threads as desired within the process that created the context.
  • Exclusive-process-and-thread compute mode: Only one CUDA context may be created on the device across all processes in the system and that context may only be current to one thread at a time.
  • Prohibited compute mode: No CUDA context can be created on the device.

NVIDIA-SMI also allows you to turn ECC (Error Correcting Code memory) mode on and off. The default is ON, but applications that do not need ECC can get higher memory bandwidth by disabling it.


The Tesla Deployment Kit is a collection of tools provided to better manage NVIDIA Tesla™ GPUs. These tools support Linux (32-bit and 64-bit), Windows 7 (64-bit), and Windows Server 2008 R2 (64-bit). The current distribution contains NVIDIA-healthmon and the NVML API.


The NVML API is a C-based API which provides programmatic state monitoring and management of NVIDIA GPU devices. The NVML dynamic run-time library ships with the NVIDIA display driver, and the NVML SDK provides headers, stub libraries and sample applications. NVML can be used from Python or Perl (bindings are available) as well as C/C++ or Fortran.

Ganglia is an open-source scalable distributed monitoring system used for clusters and grids with very low per-node overhead and high concurrency. Ganglia gmond is an NVML-based Python module for monitoring NVIDIA GPUs in the Ganglia interface.


This utility provides quick health checking of GPUs in cluster nodes. The tool detects issues and suggests remedies to software and system configuration problems, but it is not a comprehensive hardware diagnostic tool. Features include:

  • basic CUDA and NVML sanity check;
  • diagnosis of GPU failures;
  • check for conflicting drivers;
  • poorly seated GPU detection;
  • check for disconnected power cables;
  • ECC error detection and reporting;
  • bandwidth test;
  • infoROM validation.

7. Run Benchmarks and Applications

Once your cluster is up and running you will want to validate it by running some benchmarks and sample applications. There are various benchmarks and code samples for GPUs and the network as well as applications to run on the entire cluster. For GPUs, you need to run two basic tests.

  1. devicequery: This sample code is available with the CUDA Samples included in the CUDA Toolkit installation package. devicequery simply enumerates the properties of the CUDA devices present in a node. This is not a benchmark but successfully running this or any other CUDA sample serves to verify that you have the CUDA driver and toolkit properly installed on the system.
  2. bandwidthtest: This is another of the CUDA Samples included with the Toolkit. This sample measures the cudaMemcopy bandwidth of the GPU across PCI-e as well as internally. You should measure device-to-device copy bandwidth, host-to-device copy bandwidth for pageable and page-locked memory, and device-to-host copy bandwidth for pageable and page-locked memory.

To benchmark network performance, you should run the bandwidth and latency tests for your installed MPI distribution. MPI standard installations have standard benchmarks such as /tests/osu_benchmarks-3.1.1. You should consider using an open source CUDA-aware MPI implementation like MVAPICH2, as described in earlier Parallel Forall posts An Introduction to CUDA-Aware MPI and Benchmarking CUDA-Aware MPI.

To benchmark the entire cluster, you should run the LINPACK numerical linear algebra application. The top 500 supercomputers list uses the HPL benchmark to decide the fastest supercomputers on Earth. The CUDA-enabled version of HPL (High-Performance LINPACK) optimized for GPUs is available from NVIDIA on request, and there is a Fermi-optimized version available to all NVIDIA registered developers.

# In this post I have provided an overview of the basic steps to build a GPU-accelerated research prototype cluster. For more details on GPU-based clusters and some of best practices for production clusters, please refer to Dale Southard’s GTC 2013 talk S3249 – Introduction to Deploying, Managing, and Using GPU Clusters by Dale Southard.

Posted in CLOUD, CLUSTER, Computer Vision, Computing Technology, CUDA, GPU (CUDA), GRID, Linux OS, Mixed, Multimedia, PARALLEL | Tagged: , , | Leave a Comment »

Audio compression formats compared

Posted by Hemprasad Y. Badgujar on May 15, 2013

Although MP3 is the most popular format for compressing digital audio, there are literally dozens of other formats from which to choose, including AAC, Windows Media Audio (WMA), Ogg Vorbis and MPC, to name a few. The reasons for using a format other than MP3 would depend upon your requirements. For example, you may want a format that is extremely high quality, in which case you might choose MPC. If you were after a good quality format for streaming audio over a modem, WMA would probably be the best choice.


The audio codecs (codec stands for “encoder-decoder”) discussed below all belong to a class of compression called ‘lossy’. Effectively, this means that in order to achieve such high levels of compression, and consequently such small files, audio information is discarded. Lossy compression is to audio what JPEG compression is to images. By sacrificing a little bit of quality, much space can be saved in the resulting file size. The success of a lossy codec is based on how well it discards audio information considered to be imperceptible and therefore unnecessary. Some newer audio compression formats — such as AAC and WMA — do a much better job of this than the now ageing MP3 algorithm. Below is a summary of the major digital audio compression formats available and a comparison of how they rate out of 10*.








Played by almost every portable digital audio device and many DVD players, MP3 is still hard to go past if you’re looking for maximum compatibility for your files. Whilst you can get much better compression from other formats, hard disks and blank CDs are cheap enough to justify the extra file size. Stereo imaging is not terrific and encoding quality differs from one software package to another.


Compression: 5.


Quality: 7.


Compatibility: 10.


Overall: 7.5.








This format is interesting as it combines a low bitrate^ MP3 file with what is called spectral band replication (SBR) data. The SBR component of the file supplies quality high frequencies, while the MP3 part of the file produces quality low frequencies. The combination of the two ensures very small file sizes. Interestingly, the MP3 component of MP3PRO files is backwardly compatible with all MP3 players, making it a tempting choice for general use.


Compression: 6.


Quality: 7.


Compatibility: 8.


Overall: 7.







This format is a joint project between Fraunhofer (the people responsible for MP3), AT&T, Lucent, Sony and Dolby. MP3 was part of the MPEG-1 video compression specification, but AAC belongs to the MPEG-2 specification. Generally speaking, AAC files are better quality and around 30 per cent smaller than the MP3 equivalent. Some portable devices will play this format but, generally speaking, it is not in common use.


Compression: 7.


Quality: 9.


Compatibility: 6.


Overall: 7.








Window’s Media Audio is Microsoft’s contribution to high quality, lossy audio compression. Like most other new formats, it outperforms MP3 in terms of quality and compression, particularly at lower bitrates. Consequently, WMA is probably the format of choice for streaming at low bandwidths. Like MP3, however, the stereo imaging is not very accurate. Additionally, WMA tends to overcompensate for its high compression with what is often called ‘overbrightness‘.


Compression: 8.


Quality: 7.


Compatibility: 9.


Overall: 8.








This format, developed by Dolby, is often used for video soundtracks due to its ability to handle surround sound formats such as 5.1 channel information. It was designed for use in consumer electronics such as high definition TV, cable TV and satellite broadcasts. Some DVD/MP3 players support AC3 playback, although it is not widely used as a stand-alone audio format. One of the top features of AC3 is that it provides excellent stereo imaging — an area where most other lossy codecs fail.


Compression: 6.


Quality: 8.


Compatibility: 5.


Overall: 6.5.








Also known as MPEGplus, this is a much better MPEG-1 audio format than MP3, although it can only be used at high bitrates because it is designed for very high quality applications. The encoder is currently free but will become shareware. While not widely supported in general, there is a free decoder plug-in for Winamp. If quality is your main concern and file sharing isn’t on the agenda, this may be the format to choose.


Compression: 7.


Quality: 9.


Compatibility: 5.


Overall: 7.






Ogg Vorbis is a project attempting to replace all proprietary audio formats with an open standard freeware codec. Version one was released in this past fortnight and has been demonstrated to be very high quality and outperforms MP3 by a long shot. At low bitrates it doesn’t compete with WMA, and at high bitrates it falls short of MPC. Given that it is a work in progress, however, it has strong potential to become a widely used audio codec. Some portable device manufacturers are promising to support Ogg Vorbis in future software releases.


Compression: 8.


Quality: 7.


Compatibility: 6.


Overall: 7.








Real Audio is something of a dinosaur in the digital audio world. The player is free, but wastes a lot of system resources as well as being advertising intensive. That said, the recent version 8.5 release supports CD quality bitrates as well as good quality results at lower bitrates. On the downside, the codec is not integrated as a standard at an operating system level and is quite CPU intensive. To its credit, though, Real Audio encoding is exceptionally quick.


Compression: 6.


Quality: 7.


Compatibility: 9.


Overall: 7.








Quality (estimate) Format (compression type) Bitrate (Kbps) Filesize (KB/min)
CD Quality Uncompressed WAV 1411 105,000
MP3 128 960
MP3 (VBR) 112 840
RA 96 720
WMA 92 690
OGG 112 840
MPC 88 660
AAC 80 600
AC3 967 720
MP3PRO 80 600
FM Radio MP3 96 720
MP3 (VBR) 80 600
RA 64 480
WMA 56 420
OGG 67 500
MPC 64 480
AAC 56 420
AC3 64 480
MP3PRO 56 420
AM Radio MP3 64 480
MP3 (VBR) 40 300
RA 32 240
WMA 20 150
OGG 32 240
MPC 28 210
AAC 20 150
AC3 24 180
MP3PRO 22 165




Disclaimer: the above guidelines are suggested estimates only. Similar quality audio output will, in most cases, occur at the specified settings although actual values will differ according to the nature of the audio recording, the encoding software, and the playback equipment used.


* The Compression/Quality/Compatibility/Overall ratings are purely the author’s opinion. There is no definitive way of proving any of it, as it comes down to the subjective experience of the listener. In this case, Daniel Potts has based his opinion on a mix of his personal preference as well as many reviews and codec comparisons by others – Ed.^ The higher the bitrate, the lower the compression. The lower the compression, the better the quality and the larger the file size.


Got a digital music question? Ask HelpScreen


English: Manowar The Power of Thy Sword (1992)...

English: Manowar The Power of Thy Sword (1992), frequencies of the uncompressed, 128 aoTuV, mp3 and ATRAC3 Files. Deutsch: Frequenzspektrum von Manowars ‘The Power of Thy Sword’ (1992) des Albums ‘The Triumph of Steel’, Unkomprimiert, 128er aotuV, mp3, ATRAC 3 @ 292 und 132kBit/s, Musepack ~128, LC-AAC und HE-AAC ~128 und wma auf 128kbit/s (Photo credit: Wikipedia)


Posted in Animation, Audio, Entertainment, Multimedia, Songs, Text, Video | 1 Comment »

Extracts from a Personal Diary

dedicated to the life of a silent girl who eventually learnt to open up

Num3ri v 2.0

I miei numeri - seconda versione


Just another site

Algunos Intereses de Abraham Zamudio Chauca

Matematica, Linux , Programacion Serial , Programacion Paralela (CPU - GPU) , Cluster de Computadores , Software Cientifico




A great site

Travel tips

Travel tips

Experience the real life.....!!!

Shurwaat achi honi chahiye ...

Ronzii's Blog

Just your average geek's blog

Karan Jitendra Thakkar

Everything I think. Everything I do. Right here.


News About Tech, Money and Innovation

Chetan Solanki

Helpful to u, if u need it.....


Explorer of Research #HEMBAD


Explorer of Research #HEMBAD


A great site


This is My Space so Dont Mess With IT !!

%d bloggers like this: