Presentation

HPC Software Scaling for ML Using CXL 3.0 GFAM
Description
Traditional HPC systems rely on balanced soft scaling, which adjusts the compute-to-memory ratio according to the workload. However, this approach is challenged by Machine Learning applications, especially Large Language Model (LLM) workloads, which demand much more memory than compute. This leads to wasted compute resources and excessive data movement in the system. To address this issue, we propose to use CXL 3.0 Global Fabric Attached Memory (GFAM), which enables independent scaling of compute and memory and reduces data movement. In this talk, we will explore how GFAM architectures require changes in memory and compute placement, as well as software stacks, to optimize performance for LLM workloads.
Event Type
Workshop
Time
Sunday, 12 November 2023
3:30pm - 4pm MST
Location
505
Tags
Distributed Computing
Middleware and System Software
Runtime Systems
Registration Categories
W