Presentation

Optimization of Ported CFD Kernels on Intel Data Center GPU Max 1550 Using oneAPI ESIMD
Description
We describe our experience porting FUN3D's CUDA-optimized kernels to Intel oneAPI SYCL. We faced several challenges, including suboptimal performance of the oneAPI code on Intel's new data center GPU, which was caused by high register spills, memory latency, and poor vectorization. We addressed these issues by implementing the kernels with Intel oneAPI's Explicit SIMD SYCL extension (ESIMD) API. Compared to SYCL, the ESIMD API enables explicitly vectorized kernel code, gives more precise control over register usage and prefetching, and handles thread divergence better. The ESIMD code outperforms the optimized SYCL code by up to a factor of 3.6, depending on the kernel. We also compared its performance with that of the CUDA-optimized version on NVIDIA V100 and A100 GPUs. We found that the performance of a single tile of the Intel GPU using ESIMD is greater than that of the NVIDIA V100 and similar to that of the NVIDIA A100.
Event Type
Workshop
Time
Monday, 13 November 2023
2:40pm - 3pm MST
Location
710
Tags
Algorithms
Heterogeneous Computing
Large Scale Systems
Registration Categories
W