New programming languages and models

From HP-SEE Wiki

CAPS HMPP

CAPS HMPP is a complete software solution for parallelizing legacy codes, targeting the efficient use of multicore and GPGPU resources. Much like OpenMP, HMPP takes the form of a set of directives that speed up the execution of a code while preserving its portability across infrastructures. HMPP can also be applied to codes that are already parallelized, whether with MPI or OpenMP.

HMPP directives added to the application source code do not change the semantics of the original code. They address the remote execution of functions or regions of code on GPUs and many-core accelerators, as well as the transfer of data to and from the target device memory.
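
As a minimal sketch of this mechanism, the example below declares a plain C function as an HMPP codelet and executes it remotely at a callsite. The function name vec_add, the label vadd, the array size and the choice of the CUDA target are illustrative assumptions rather than details taken from this page; the directive spelling follows the CAPS HMPP 2.x style.

  #include <stdio.h>

  /* Assumed HMPP 2.x syntax: declare a codelet that may be offloaded
     to a CUDA target; the label "vadd" ties it to the callsite below. */
  #pragma hmpp vadd codelet, target=CUDA, args[c].io=out
  void vec_add(int n, float a[n], float b[n], float c[n])
  {
      for (int i = 0; i < n; i++)
          c[i] = a[i] + b[i];
  }

  int main(void)
  {
      enum { N = 1024 };
      float a[N], b[N], c[N];

      for (int i = 0; i < N; i++) {
          a[i] = (float)i;
          b[i] = 2.0f * (float)i;
      }

      /* Remote execution: runs on the accelerator when one is present,
         otherwise the native host version of the function is used. */
      #pragma hmpp vadd callsite
      vec_add(N, a, b, c);

      printf("c[10] = %f\n", c[10]);
      return 0;
  }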

Implementing HMPP directives on top of an existing code involves three steps (illustrated in the sketch after this list):

  * declaration of codelets (kernels) that hold a critical amount of computation
  * management of data transfers to and from the target device (e.g. GPGPU) memory
  * optimization of kernel performance and data synchronization
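
The following sketch shows how the data-management and synchronization steps might look for a codelet labelled vadd (declared as in the previous example). The advancedload/delegatedstore directives and the asynchronous/synchronize pair existed in CAPS HMPP 2.x, but the exact clause spelling here is an assumption rather than a quote from this page.

  /* Assumed HMPP 2.x-style data management around the "vadd" codelet:
     upload inputs early, launch without blocking, and download the
     result only when it is actually needed. */
  #pragma hmpp vadd allocate                      /* reserve device memory   */
  #pragma hmpp vadd advancedload, args[a;b]       /* copy inputs to device   */

  #pragma hmpp vadd callsite, asynchronous        /* launch without blocking */
  vec_add(N, a, b, c);

  /* ... independent host work can overlap with the codelet here ... */

  #pragma hmpp vadd synchronize                   /* wait for completion     */
  #pragma hmpp vadd delegatedstore, args[c]       /* copy result back        */
  #pragma hmpp vadd release                       /* free device memory      */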

In practice, HMPP compiles the application for the native host separately from the GPU-accelerated codelet functions, which are built on top of the native application as software plugins. In effect, the resulting application can be executed on a host that either has or does not have an accelerator resource (e.g. a GPGPU): codelets are executed on such hardware only when it is found on the host, and the native code runs otherwise.

Note that the codelets are translated into the NVIDIA CUDA and OpenCL languages by the HMPP backend, and are thus compiled with the existing tools for these languages in the software stack.

OpenACC

OpenACC comprises a set of standardized, high-level pragmas that enable C/C++ and Fortran programmers to offload their code to massively parallel processors with much of the convenience of OpenMP. The OpenACC standard thus preserves the familiarity of OpenMP code annotation while extending the execution model to encompass devices that reside in separate memory spaces. To support coprocessors, OpenACC pragmas annotate data placement and transfer (going beyond OpenMP) as well as loop and block parallelism.
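
As a minimal sketch (the function and array names are illustrative assumptions), an OpenACC parallel loop in C looks much like its OpenMP counterpart, with the pragma additionally describing the data movement between host and device memory:

  #include <stdio.h>

  #define N 1024

  int main(void)
  {
      float a[N], b[N], c[N];

      for (int i = 0; i < N; i++) {
          a[i] = (float)i;
          b[i] = 2.0f * (float)i;
      }

      /* Offload the loop to an accelerator; copyin/copyout describe
         the transfers required between host and device memory. */
      #pragma acc parallel loop copyin(a, b) copyout(c)
      for (int i = 0; i < N; i++)
          c[i] = a[i] + b[i];

      printf("c[10] = %f\n", c[10]);
      return 0;
  }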

Note that, similarly to OpenMP, OpenACC provides portability across operating systems, host CPUs and accelerator resources.

The details of data management and parallelism are implicit in the programming model and are handled by OpenACC API-enabled compilers and runtimes. The programming model thus allows the programmer to express data management and guidance on the mapping of loops onto an accelerator, as well as similar performance-related details, in a clear way.
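
For instance, a structured data region can keep arrays resident on the device across several loops, while loop directives provide mapping guidance. In the sketch below the gang/vector clauses and the vector length are illustrative assumptions, not tuned recommendations:

  #define N 1024

  /* Illustrative sketch: two loops sharing device-resident data. */
  void fused_ops(const float a[N], const float b[N], float c[N])
  {
      float tmp[N];

      /* Keep a, b, c and tmp on the device for the whole region,
         avoiding transfers between the two loops. */
      #pragma acc data copyin(a, b) copyout(c) create(tmp)
      {
          /* Mapping guidance: distribute iterations over gangs and
             vector lanes; the vector length is purely illustrative. */
          #pragma acc parallel loop gang vector vector_length(128)
          for (int i = 0; i < N; i++)
              tmp[i] = a[i] * b[i];

          #pragma acc parallel loop
          for (int i = 0; i < N; i++)
              c[i] = tmp[i] + a[i];
      }
  }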

OpenACC directives can at present be used via the CAPS HMPP product and via the latest versions of the PGI Compiler Suite.

MPI-ACC
