Presentation

5 ExaFlop/s HPL-MxP Benchmark with Linear Scalability on the 40-Million-Core Sunway Supercomputer
Description
HPL-MxP is an emerging high-performance benchmark used to measure the mixed-precision computing capability of leading supercomputers. This work presents our efforts on the new Sunway supercomputer, where we scale the benchmark linearly to over 40 million cores, sustain an overall mixed-precision performance exceeding 5 ExaFlop/s, and achieve over 85% of peak performance, the highest efficiency reached among all heterogeneous systems on the HPL-MxP list. The optimizations in our HPL-MxP implementation include: (1) a Two-Direction Look-Ahead and Overlap algorithm that overlaps all communication with computation; (2) a multi-level process-mapping and communication-scheduling method that makes the best possible use of the network while maintaining a conflict-free algorithm flow; and (3) a CG-Fusion computing framework that eliminates up to 60% of inter-chip communication and removes the memory-access bottleneck while serving computation and communication simultaneously. This work could also provide useful insights for tuning cutting-edge applications on Sunway supercomputers and other heterogeneous supercomputers.
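The description does not spell out the numerics behind the benchmark. As background, HPL-MxP solves a dense linear system by factorizing the matrix in low precision and then refining the solution until it reaches FP64 accuracy. The following is a minimal sketch of that idea only, not the authors' implementation: it assumes a small diagonally dominant matrix so that pivoting can be skipped, emulates FP32 with the standard `struct` module, and uses classical iterative refinement in place of the GMRES-based refinement used by the reference benchmark. All function names (`f32`, `lu_factor_f32`, `lu_solve`, `residual`, `refine`) are hypothetical.

```python
import struct


def f32(x):
    """Round a Python float (FP64) to the nearest FP32 value."""
    return struct.unpack("f", struct.pack("f", x))[0]


def lu_factor_f32(a):
    """LU factorization without pivoting, each operation rounded to FP32.

    Returns one matrix holding L (unit lower triangle, below the diagonal)
    and U (on and above the diagonal). Assumes no zero pivots occur.
    """
    n = len(a)
    lu = [[f32(v) for v in row] for row in a]
    for k in range(n):
        for i in range(k + 1, n):
            lu[i][k] = f32(lu[i][k] / lu[k][k])
            for j in range(k + 1, n):
                lu[i][j] = f32(lu[i][j] - f32(lu[i][k] * lu[k][j]))
    return lu


def lu_solve(lu, b):
    """Solve L U x = b in FP64 arithmetic using the low-precision factors."""
    n = len(b)
    x = list(b)
    for i in range(n):                 # forward substitution with unit L
        for j in range(i):
            x[i] -= lu[i][j] * x[j]
    for i in reversed(range(n)):       # back substitution with U
        for j in range(i + 1, n):
            x[i] -= lu[i][j] * x[j]
        x[i] /= lu[i][i]
    return x


def residual(a, x, b):
    """Compute r = b - A x in FP64."""
    return [bi - sum(aij * xj for aij, xj in zip(row, x))
            for row, bi in zip(a, b)]


def refine(a, b, steps=5):
    """Low-precision factorization followed by FP64 iterative refinement."""
    lu = lu_factor_f32(a)
    x = lu_solve(lu, b)
    history = [max(abs(r) for r in residual(a, x, b))]
    for _ in range(steps):
        d = lu_solve(lu, residual(a, x, b))   # correction from cheap factors
        x = [xi + di for xi, di in zip(x, d)]
        history.append(max(abs(r) for r in residual(a, x, b)))
    return x, history


# Illustrative diagonally dominant system (values chosen for this sketch).
a = [[4.0, 1.0, 2.0], [1.0, 5.0, 1.0], [2.0, 1.0, 6.0]]
b = [1.0, 2.0, 3.0]
x, history = refine(a, b)
print("max residual per refinement step:", history)
```

The expensive O(n^3) factorization runs entirely in the cheaper precision, while each refinement step costs only O(n^2) in FP64. This is why the benchmark rewards hardware with fast low-precision arithmetic.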
Event Type
Paper
Time
Wednesday, 15 November 2023, 2:30pm - 3pm MST
Location
403-404
Tags
Exascale
Large Scale Systems
State of the Practice
Registration Categories
TP