PGI Compilers & Tools are used by scientists and engineers developing applications for high-performance computing (HPC).
Overview
The article discusses the latest release of PGI Compilers & Tools, version 19.1, which enhances support for high-performance computing applications. Key features include support for V100 Tensor Cores, full C++17 language support, and the addition of OpenACC printf() for debugging.
What You'll Learn
1. How to utilize V100 Tensor Cores in CUDA Fortran applications
2. Why full support for C++17 is beneficial for software development
3. How to use OpenACC printf() for debugging OpenACC code
4. When to apply PCAST directives for comparative debugging
5. How LLVM 7.0 improves performance on x86 and OpenPOWER CPUs
Key Questions Answered
What new features are included in PGI 19.1?
PGI 19.1 introduces several new features including support for V100 Tensor Cores in CUDA Fortran, full C++17 language support, the ability to use printf() statements in OpenACC for debugging, expanded PCAST directives for comparative debugging, and improved performance on x86 and OpenPOWER CPUs with LLVM 7.0.
How does PGI 19.1 enhance debugging capabilities?
The addition of OpenACC printf() allows developers to insert printf() statements directly into their OpenACC code, facilitating easier analysis and debugging of parallel code. This feature is particularly useful for identifying issues in GPU-accelerated applications.
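As a rough illustration of the idea, the sketch below places a printf() call inside an OpenACC loop. The function name `saxpy_with_trace` and the choice of which indices to print are assumptions made for this example and do not come from the article. When the code is built without OpenACC enabled, the pragma is ignored and the loop simply runs on the host.

```cpp
#include <cassert>
#include <cstdio>
#include <vector>

// Hypothetical helper (not from the article): computes a*x + y and prints
// a trace line for a few sample indices from inside the compute region.
std::vector<float> saxpy_with_trace(float a,
                                    const std::vector<float>& x,
                                    const std::vector<float>& y) {
    assert(x.size() == y.size());  // host-side precondition check

    const int n = static_cast<int>(x.size());
    std::vector<float> result(x.size());
    const float* xp = x.data();
    const float* yp = y.data();
    float* rp = result.data();

    // With an OpenACC compiler this loop can be offloaded to the GPU;
    // otherwise the directive is ignored and the loop runs serially.
    #pragma acc parallel loop copyin(xp[0:n], yp[0:n]) copyout(rp[0:n])
    for (int i = 0; i < n; ++i) {
        rp[i] = a * xp[i] + yp[i];
        // Limit the output to the first two and the last element so the
        // trace stays readable for large arrays.
        if (i < 2 || i == n - 1) {
            std::printf("i=%d x=%f y=%f result=%f\n", i, xp[i], yp[i], rp[i]);
        }
    }
    return result;
}
```

Calling `saxpy_with_trace(2.0f, x, y)` returns the computed vector and emits the trace lines as a side effect. On a GPU the order in which lines from different iterations appear is not guaranteed, so including the index in each line, as done here, makes the output easier to interpret.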
Technologies & Tools
Programming Language
CUDA Fortran
Used for developing applications that leverage GPU acceleration.
Programming Language
C++17
Fully supported in PGI 19.1, giving developers access to modern C++ language features.
Parallel Programming Model
OpenACC
Facilitates GPU programming through directives.
Compiler Infrastructure
LLVM 7.0
Improves performance on x86 and OpenPOWER CPUs.
Key Actionable Insights
1. Leverage the new V100 Tensor Core support to optimize performance in machine learning applications. By utilizing Tensor Cores, developers can significantly accelerate matrix operations, which are critical in deep learning workloads, leading to faster training times and improved model performance.
2. Adopt C++17 features to enhance code maintainability and performance. C++17 introduces several improvements such as structured bindings and std::optional, which can lead to cleaner and more efficient code, making it easier to maintain and extend.
3. Use OpenACC printf() for effective debugging of GPU code. Incorporating printf() statements allows developers to trace execution flow and variable states in parallel code, which is essential for diagnosing issues that arise in complex GPU computations.
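To make the second insight more concrete, here is a minimal sketch of the two C++17 features it names, structured bindings and std::optional. The function names `find_count` and `total_count` and the sample data are hypothetical and chosen only for illustration; the code needs a compiler running in C++17 mode.

```cpp
#include <cassert>
#include <map>
#include <optional>
#include <string>

// Hypothetical lookup: returns the count stored for a key, or std::nullopt
// when the key is absent, instead of a sentinel value such as -1.
std::optional<int> find_count(const std::map<std::string, int>& counts,
                              const std::string& key) {
    // C++17 if-statement with initializer keeps the iterator scoped.
    if (auto it = counts.find(key); it != counts.end()) {
        return it->second;
    }
    return std::nullopt;
}

// Hypothetical aggregate: sums all counts using structured bindings to
// name the key and value of each map entry directly.
int total_count(const std::map<std::string, int>& counts) {
    int total = 0;
    for (const auto& [name, count] : counts) {
        assert(count >= 0);  // this sketch assumes non-negative counts
        total += count;
    }
    return total;
}
```

A caller can write `find_count(counts, "gpu").value_or(0)` to supply a default for missing keys, which avoids a separate existence check before the lookup.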