Presentation

That's Right – The Same C++ STL Asynchronous Parallel Code Runs on CPUs and GPUs
Description
High-performance computing applications running on modern supercomputers frequently encounter performance and portability challenges, especially when they use multiple programming models, languages, and compilers. In this work, we explore the proposed C++26 standard model for asynchronous parallelism, called std::execution or stdexec, together with stdpar, std::mdspan, and other C++23 features, to port and analyze multiple scientific HPC applications on CPUs and GPUs. These applications include sequence alignment codes from ADEPT and heat transfer from AMReX. Our experiments show near-native performance for our ported implementations on NVIDIA A100 GPUs on the Perlmutter supercomputer. We also analyze the data transfer traffic patterns and overheads between the host and device for stdpar and provide insights into application performance. Finally, we discuss some challenges and limitations encountered while porting these applications to C++26 with stdexec, as well as their workarounds, until stdexec is fully integrated and functional in the NVHPC compilers.
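To make the stdpar idea concrete, here is a minimal sketch of one explicit time step of a 1D heat equation written with a standard algorithm. It is not the poster's code and is not taken from AMReX or ADEPT; the function name `heat_step`, the parameter `alpha`, and the `USE_STDPAR` macro are assumptions made for illustration. It covers only the stdpar part (standard parallel algorithms), not stdexec senders. When `USE_STDPAR` is defined, the same loop body is handed to `std::execution::par_unseq`, which a compiler with stdpar offload support (for example NVHPC's nvc++ with `-stdpar`) may run on a GPU; without the macro it runs sequentially.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// USE_STDPAR is a hypothetical macro for this sketch: define it to request
// the parallel, vectorizable execution policy; leave it undefined to run
// the identical algorithm sequentially.
#ifdef USE_STDPAR
#include <execution>
#define HEAT_POLICY std::execution::par_unseq,
#else
#define HEAT_POLICY
#endif

// One explicit finite-difference step of the 1D heat equation:
//   next[i] = cur[i] + alpha * (cur[i-1] - 2*cur[i] + cur[i+1])
// Boundary cells (first and last) are held fixed.
// heat_step and alpha are illustrative names, not from the poster.
std::vector<double> heat_step(const std::vector<double>& cur, double alpha) {
    std::vector<double> next(cur);      // copy keeps the boundary values
    if (cur.size() < 3) return next;    // no interior cells to update

    const double* in = cur.data();
    double* out = next.data();

    // Iterate over the interior cells of the output. The cell index is
    // recovered from the element's address, which is valid because the
    // storage is contiguous.
    std::for_each(HEAT_POLICY out + 1, out + cur.size() - 1,
                  [=](double& cell) {
                      const std::ptrdiff_t i = &cell - out;
                      cell = in[i] + alpha * (in[i - 1] - 2.0 * in[i] + in[i + 1]);
                  });
    return next;
}
```

Each output cell is written by exactly one invocation and only reads from the input array, so the body is free of data races under either policy. Whether the parallel version actually runs on a GPU depends on the compiler and its flags, not on the source.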
Event Type
Posters
Research Posters
Time
Tuesday, 14 November 2023, 10am - 5pm MST
Registration Categories
TP
XO/EX