Presentation

HPC Software Scaling for ML Using CXL 3.0 GFAM

Session

6th International Workshop on Emerging Parallel Distributed Runtime Systems and Middleware

Description

Traditional HPC systems rely on balanced soft scaling, which adjusts the compute-to-memory ratio according to the workload. However, this approach is challenged by Machine Learning applications, especially Large Language Model (LLM) workloads, which demand much more memory than compute. This leads to wasted compute resources and excessive data movement in the system. To address this issue, we propose to use CXL 3.0 Global Fabric Attached Memory (GFAM), which enables independent scaling of compute and memory and reduces data movement. In this talk, we will explore how GFAM architectures require changes in memory and compute placement, as well as software stacks, to optimize performance for LLM workloads.

Presenter

Patrick Estep

Micron Technology Inc

Event Type

Workshop

Time

Sunday, 12 November 2023, 3:30pm - 4pm MST

Location

505
