Presentation



Specialized Kernels for Optimizing GPU Offload in OpenMP

Session: Tenth Workshop on Accelerator Programming and Directives (WACCPD 2023)

Description: Programming models for general-purpose GPU (GPGPU) computing include grid and non-grid languages. Grid languages such as CUDA and HIP map directly to the GPU hardware and can extract high performance from applications. However, this low-level approach makes them harder to program than non-grid languages such as C, C++, and Fortran with OpenMP target offload, and grid languages often have more portability issues. On the other hand, code generated from non-grid languages using automatic compiler and runtime techniques often incurs higher overhead while generating GPU kernels.

This presentation discusses compiler and runtime techniques to generate specialized, high-performance kernels for OpenMP target regions in certain common situations. We outline conditions under which specialized kernels are generated for OpenMP target regions, both with and without reduction clauses. Experimental results on AMD GPUs indicate that a large percentage of OpenMP target regions are amenable to specialization and consequent performance improvement.
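For readers unfamiliar with OpenMP target offload, the two kinds of target regions mentioned above (without and with a reduction clause) can be sketched at the source level as follows. This is only an illustration of the programming model, not the authors' benchmarks or generated kernels; the function names `scale_add` and `dot` are hypothetical, and the abstract does not state whether these particular loops meet the conditions for specialization.

```c
#include <stddef.h>

/* Hypothetical example: a target region WITHOUT a reduction clause.
 * Each iteration is independent and writes only y[i]. */
void scale_add(double a, const double *x, double *y, size_t n)
{
    #pragma omp target teams distribute parallel for \
        map(to: x[0:n]) map(tofrom: y[0:n])
    for (size_t i = 0; i < n; ++i) {
        y[i] = a * x[i] + y[i];
    }
}

/* Hypothetical example: a target region WITH a reduction clause.
 * All iterations contribute to the single scalar sum. */
double dot(const double *x, const double *y, size_t n)
{
    double sum = 0.0;
    #pragma omp target teams distribute parallel for \
        reduction(+: sum) map(to: x[0:n], y[0:n]) map(tofrom: sum)
    for (size_t i = 0; i < n; ++i) {
        sum += x[i] * y[i];
    }
    return sum;
}
```

When compiled without OpenMP support, the pragmas are ignored and both functions run serially on the host with the same results, so the sketch can be checked without a GPU.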

Authors/Presenters

Dhruva Chakrabarti

Advanced Micro Devices (AMD) Inc

Gregory Rodgers

Advanced Micro Devices (AMD) Inc

Carlo Bertolli

Advanced Micro Devices (AMD) Inc

Gheorghe-Teodor Bercea

Advanced Micro Devices (AMD) Inc

Jan-Patrick Lehr

Advanced Micro Devices (AMD) Inc

Lynd Stringer