PGI 17.7 Delivers OpenACC and CUDA Fortran for Volta GPUs

PGI compilers & tools are used by scientists and engineers who develop applications for high-performance computing (HPC) systems. They deliver world-class…

Brad Nemire

Overview

PGI 17.7 adds support for Tesla V100 (Volta) GPUs to its OpenACC and CUDA Fortran compilers, targeting high-performance computing (HPC) applications. The release also brings OpenACC support for CUDA Unified Memory, OpenMP 4.5 for multicore CPUs, C++14 lambdas with capture in OpenACC regions, and C++ performance optimizations.

What You'll Learn

1. How to utilize OpenACC directives for GPU computing
2. Why performance portability is crucial in HPC applications
3. When to implement OpenMP 4.5 for multicore CPUs

Key Questions Answered

What new features are included in PGI 17.7?
PGI 17.7 includes support for Tesla V100 GPUs, OpenACC for CUDA Unified Memory, OpenMP 4.5 for multicore CPUs, C++14 lambdas with capture in OpenACC regions, and performance optimizations for C++.

How does PGI 17.7 improve GPU computing?
PGI 17.7 supports OpenACC directives, which let developers mark loops for GPU execution instead of writing low-level GPU code. It also supports CUDA Unified Memory, which lets the CUDA runtime migrate data between CPU and GPU memory, so developers need fewer explicit data-movement directives.
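
To make the directive-based model concrete, here is a minimal sketch of an OpenACC loop in C++. It is not taken from the PGI release: the function name `saxpy` is hypothetical, and the note about omitting data clauses assumes a unified-memory build (with PGI compilers this is typically requested with `-acc -ta=tesla:managed`; check the exact flags against the release documentation).

```cpp
#include <cassert>
#include <vector>

// Hypothetical example: computes y[i] = a * x[i] + y[i] for every element.
// A compiler without OpenACC support ignores the pragma, and the loop
// then runs serially on the CPU with the same result.
void saxpy(float a, const std::vector<float>& x, std::vector<float>& y) {
    assert(x.size() == y.size());
    const int n = static_cast<int>(x.size());
    const float* xp = x.data();
    float* yp = y.data();

    // The data clauses describe what to move to and from the GPU.
    // Assumption: in a unified-memory build the runtime migrates heap
    // data automatically, so these clauses could be left out.
    #pragma acc parallel loop copyin(xp[0:n]) copy(yp[0:n])
    for (int i = 0; i < n; ++i) {
        yp[i] = a * xp[i] + yp[i];
    }
}
```

The loop body is ordinary C++; only the pragma expresses the parallelism, which is what keeps the same source usable on CPU-only builds.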

Technologies & Tools

OpenACC (programming model)
Used for simplifying GPU programming and improving performance portability.

CUDA Fortran (programming language)
Enables developers to write Fortran code that can leverage NVIDIA GPUs.

OpenMP 4.5 (programming model)
Provides support for parallel programming on multicore CPUs.

Tesla V100 (hardware)
High-performance GPU supported by PGI 17.7 for enhanced computing capabilities.
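
The OpenMP entry above can be illustrated with a small multicore loop. This is a minimal sketch, not code from the PGI release: the function name `dot` is hypothetical, and it uses only a basic worksharing loop with a reduction, which is valid in OpenMP 4.5 but not new to that version.

```cpp
#include <cassert>
#include <vector>

// Hypothetical example: dot product of two equally sized vectors.
// With OpenMP enabled, iterations are divided among CPU threads and
// each thread's partial sum is combined by the reduction clause.
// Without OpenMP, the pragma is ignored and the loop runs serially.
double dot(const std::vector<double>& a, const std::vector<double>& b) {
    assert(a.size() == b.size());
    const long n = static_cast<long>(a.size());
    const double* ap = a.data();
    const double* bp = b.data();

    double sum = 0.0;
    #pragma omp parallel for reduction(+:sum)
    for (long i = 0; i < n; ++i) {
        sum += ap[i] * bp[i];
    }
    return sum;
}
```

Because threads may add their partial sums in a different order from a serial loop, floating-point results can differ in the last bits between runs with different thread counts.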

Key Actionable Insights

1. Leverage OpenACC directives to simplify GPU programming in your HPC applications.
   Using OpenACC can significantly reduce the complexity of coding for GPUs, allowing developers to focus on algorithm development rather than low-level GPU management.
2. Implement OpenMP 4.5 to optimize performance on multicore CPUs.
   OpenMP 4.5 provides advanced features for parallel programming, which can help maximize the performance of applications running on multicore systems.
3. Utilize C++14 lambdas with capture in OpenACC regions for cleaner code.
   This feature allows for more expressive and maintainable code when working with parallel regions, making it easier to manage data and state.
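
The third insight can be sketched as a lambda that captures local values and is called inside an OpenACC loop. This is an illustration under assumptions, not the release's own example: the function name `scale_and_shift` and the lambda `transform` are hypothetical, and whether a particular compiler version accepts this exact form inside a compute region should be checked against its release notes.

```cpp
#include <cassert>
#include <vector>

// Hypothetical example: writes scale * in[i] + shift into out[i].
// The lambda captures scale and shift by value, so the per-element
// formula lives in one place and the loop body stays a single call.
void scale_and_shift(const std::vector<double>& in, std::vector<double>& out,
                     double scale, double shift) {
    assert(in.size() == out.size());
    const int n = static_cast<int>(in.size());
    const double* src = in.data();
    double* dst = out.data();

    auto transform = [scale, shift](double x) { return scale * x + shift; };

    // Without OpenACC support the pragma is ignored and the loop
    // runs serially on the CPU with the same result.
    #pragma acc parallel loop copyin(src[0:n]) copyout(dst[0:n])
    for (int i = 0; i < n; ++i) {
        dst[i] = transform(src[i]);
    }
}
```

Capturing by value keeps the lambda free of references to host stack variables, which is the simpler case for a compiler to offload.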