HPC tools classification
Gone are the days when only a few parallel programming frameworks were available, such as OpenMP, MPI and Threading Building Blocks. Now, with the advent of GPU computing and manycore architectures, there are many High Performance Computing (HPC) languages and tools available that help speed up our applications.
The HPC tools can be classified into four categories:
1. HPC Migration Language / HPC Language Extensions:
These HPC tools are essentially extensions of existing languages such as C/C++. NVIDIA CUDA (Compute Unified Device Architecture) is one of them. To migrate an application using CUDA, the source code needs to be re-engineered: the algorithm must be restructured so that a large number of GPU threads can be utilized to achieve the desired speed-up. OpenCL (Open Computing Language) is another such language extension; it is a lower-level language than CUDA. The latest addition to this category is Microsoft C++ AMP.
2. Parallel Coding Assistant:
The tools in this category assist us while coding in an IDE such as Microsoft Visual Studio. Intel Parallel Studio 2011 and Intel Parallel Studio XE 2011 can be integrated with Microsoft Visual Studio. They provide features that help a programmer analyze the code to find hotspots for adding parallelism, and compose the source code by adding Intel Threading Building Blocks or Intel Cilk Plus constructs to exploit that parallelism. They also have features to find memory errors and threading errors. The modified source code can be executed on any multi-core CPU or on Intel's Many Integrated Core (MIC) architecture.
3. Directive-based Accelerator Models:
These are programming models that help a programmer exploit parallelism by adding directives to the potentially parallel portions of a sequential source code. The directives can be C pragma directives. The PGI Accelerator compiler from The Portland Group, Inc. (PGI) is based on such a programming model. It also provides compiler feedback for the portions that could not be parallelized owing to the dependencies involved. For computations to be performed on a GPU, the data needs to be copied to GPU device memory and the result copied back to the CPU; this data transfer is taken care of by the PGI Accelerator compiler. The portion of the source code marked as a parallel region gets executed directly on the GPU device, thus accelerating the application. HMPP from CAPS Enterprise is a similar accelerator model. OpenACC is an upcoming accelerator model being supported by Cray, CAPS Enterprise, NVIDIA and PGI.
4. Libraries assisting in HPC migration:
There are many libraries available that make GPU programming easier. Common algorithms such as reduction and scan, which are frequently required in GPU programming, are available in versions optimized for execution on GPU devices. CUBLAS, CUFFT and CURAND are such libraries, usable on the CUDA platform. Thrust is a library of parallel algorithms with an interface resembling the C++ Standard Template Library (STL) that greatly enhances developer productivity. Libra SDK is a C++ programming API for creating high-performance applications. ArrayFire is a GPU software acceleration library.
The HPC tools mentioned above are not an exhaustive list. Through this blog, I have tried to classify the HPC tools and describe each of them very briefly. More details on these HPC tools will follow in my coming blogs.