Presentation

Performance Portability Evaluation of Blocked Stencil Computations on GPUs
Description
In this new era where multiple GPU vendors are leading the supercomputing landscape, and multiple programming models are available to users, the drive to achieve performance portability across platforms faces new challenges. Consider stencil algorithms, where architecture-specific solutions are required to optimize for the parallelism hierarchy and memory hierarchy of emerging systems. In this work, we analyze performance portability of the BrickLib domain-specific library and vector code generator for stencils. BrickLib employs fine-grain data blocking to reduce the large amount of data movement associated with stencils. We compare different GPUs (NVIDIA, AMD and Intel) and their associated programming models (CUDA, HIP and SYCL). By testing a wide range of stencil configurations, we show that overall, BrickLib achieves good performance independent of machine or programming model. Moreover, we introduce correlation models as a new tool for comparing architectures and programming models from Roofline model data.
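To make the idea of fine-grain data blocking concrete, here is a minimal sketch of a generic brick-major layout, in which each small cube of the grid is stored contiguously so that a stencil's neighbours mostly fall inside the same block. This is not BrickLib's API: the brick edge length of 4 and the names `brick_offset` and `to_bricks` are assumptions made for illustration only.

```python
BRICK = 4  # hypothetical brick edge length (assumption, not from the abstract)


def brick_offset(k, j, i, ny, nx, b=BRICK):
    """Flat offset of grid point (k, j, i) in a brick-major layout.

    ny and nx are the grid extents in y and x; both must be multiples of b.
    """
    bricks_y, bricks_x = ny // b, nx // b
    # Which brick the point belongs to, numbered in row-major brick order.
    brick_id = ((k // b) * bricks_y + (j // b)) * bricks_x + (i // b)
    # Position of the point inside its brick, also row-major.
    inner = ((k % b) * b + (j % b)) * b + (i % b)
    return brick_id * b ** 3 + inner


def to_bricks(grid, nz, ny, nx, b=BRICK):
    """Copy a row-major flat list into brick-major order."""
    assert nz % b == 0 and ny % b == 0 and nx % b == 0
    out = [0] * (nz * ny * nx)
    for k in range(nz):
        for j in range(ny):
            for i in range(nx):
                out[brick_offset(k, j, i, ny, nx, b)] = grid[(k * ny + j) * nx + i]
    return out
```

With an 8 x 8 x 8 grid, the first 64 entries of the reordered array hold exactly the points with all three indices below 4, so a stencil sweeping that brick touches one contiguous span rather than strided rows and planes.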
Event Type
Workshop
Time
Monday, 13 November 2023, 9:41am - 10am MST
Location
605
Tags
Performance Measurement, Modeling, and Tools
Performance Optimization
Registration Categories
W