MPI-xCCL: A Portable MPI Library over Collective Communication Libraries for Various Accelerators
Description
The evolution of high-performance computing toward diverse accelerators, including NVIDIA, AMD, and Intel GPUs and Habana Gaudi accelerators, demands user-friendly and efficient utilization of these technologies. While both GPU-aware MPI libraries and vendor-specific communication libraries cater to communication requirements, trade-offs emerge based on library selection across various message sizes. Prioritizing usability, we therefore propose MPI-xCCL, a Message Passing Interface-based runtime with cross-accelerator support for efficient, portable, scalable, and optimized communication performance. MPI-xCCL incorporates vendor-specific libraries into GPU-aware MPI runtimes, ensuring multi-accelerator compatibility while adhering to the MPI standard. The proposed hybrid designs leverage the benefits of both MPI and xCCL algorithms transparently to the end user. We evaluated our designs on various HPC systems using the OSU Micro-Benchmarks and the deep learning framework TensorFlow with Horovod. On the NVIDIA-GPU-enabled ThetaGPU system, our designs outperformed Open MPI by 4.6x. On emerging Habana Gaudi-based systems, MPI-xCCL also delivered performance comparable to vendor-provided communication runtimes.
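The message-size trade-off underlying the hybrid design can be illustrated with a minimal sketch, not the actual MPI-xCCL implementation: a wrapper that dispatches an allreduce on GPU buffers to the vendor collective library (NCCL here) for large messages and to the GPU-aware MPI path for small ones. The wrapper name hybrid_allreduce and the crossover constant HYBRID_THRESHOLD are hypothetical, and the NCCL communicator and CUDA stream are assumed to be initialized elsewhere.

```c
/* Illustrative sketch only -- not the MPI-xCCL implementation.
 * Routes an allreduce on GPU-resident float buffers to NCCL for
 * large messages and to the GPU-aware MPI runtime for small ones,
 * mirroring the message-size trade-off described in the abstract.
 * hybrid_allreduce and HYBRID_THRESHOLD are hypothetical names. */
#include <mpi.h>
#include <nccl.h>
#include <cuda_runtime.h>

#define HYBRID_THRESHOLD (64 * 1024)  /* assumed crossover point, in bytes */

int hybrid_allreduce(const void *sendbuf, void *recvbuf, int count,
                     MPI_Comm mpi_comm, ncclComm_t nccl_comm,
                     cudaStream_t stream)
{
    size_t bytes = (size_t)count * sizeof(float);

    if (bytes >= HYBRID_THRESHOLD) {
        /* Large message: the vendor collective typically wins. */
        ncclResult_t rc = ncclAllReduce(sendbuf, recvbuf, (size_t)count,
                                        ncclFloat, ncclSum,
                                        nccl_comm, stream);
        if (rc != ncclSuccess)
            return MPI_ERR_OTHER;
        /* NCCL collectives are asynchronous on the stream. */
        cudaStreamSynchronize(stream);
        return MPI_SUCCESS;
    }

    /* Small message: the GPU-aware MPI path has lower latency. */
    return MPI_Allreduce(sendbuf, recvbuf, count, MPI_FLOAT,
                         MPI_SUM, mpi_comm);
}
```

In the abstract's design this selection happens transparently inside the standard MPI_Allreduce call rather than through a separate wrapper, so applications need no source changes; the sketch merely makes the dispatch logic explicit.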
Event Type
Workshop
Time
Sunday, 12 November 2023, 4:50pm - 5:10pm MST
Location
505
Tags
Distributed Computing
Middleware and System Software
Runtime Systems
Registration Categories
W