Presentation

Dynamic Memory Provisioning on Disaggregated HPC Systems
Description
Disaggregated memory aims to break the rigid boundaries between node memory hierarchies by providing memory as a pooled resource. The resource manager allocates the system's memory at job submission time. However, it is hard for users to know a job's precise peak memory footprint, and prior work has shown that users have an incentive to overestimate. This leads to significant overallocation, and most of the physical memory in the system is wasted. We present a way to reclaim much of this overallocated memory. We extend the Slurm job scheduler to dynamically reallocate memory according to each job's current memory footprint. We enhance an existing Slurm simulator to model this situation and combine publicly available traces to model an HPC system with up to 1490 nodes. We show that the dynamic memory provisioning approach increases the throughput per dollar by up to 38% compared to a system with static allocation of disaggregated memory.
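The difference between static and dynamic provisioning described above can be illustrated with a toy calculation. This is only a minimal sketch under stated assumptions, not the authors' Slurm extension or simulator: the `Job` class, the function names, the `headroom_gb` parameter, and all job sizes are hypothetical and chosen purely for illustration.

```python
from dataclasses import dataclass


@dataclass
class Job:
    name: str
    requested_gb: int  # user's (often overestimated) request at submission time
    footprint_gb: int  # hypothetical currently measured memory footprint


def static_allocation(jobs):
    """Pool memory held when every job keeps its full request for its lifetime."""
    return sum(j.requested_gb for j in jobs)


def dynamic_allocation(jobs, headroom_gb=1):
    """Pool memory held when each allocation is resized to the current
    footprint plus a small headroom (an assumed policy), capped at the request."""
    return sum(min(j.requested_gb, j.footprint_gb + headroom_gb) for j in jobs)


def reclaimed(jobs, headroom_gb=1):
    """Memory returned to the pool by resizing instead of holding full requests."""
    return static_allocation(jobs) - dynamic_allocation(jobs, headroom_gb)


# Made-up jobs: each requests more memory than it currently uses.
jobs = [Job("a", 64, 20), Job("b", 32, 30), Job("c", 128, 40)]

print(static_allocation(jobs))   # 224
print(dynamic_allocation(jobs))  # 93
print(reclaimed(jobs))           # 131
```

In this illustrative setup, the 131 GB that static allocation would hold idle becomes available to other jobs in the pool. A real scheduler would also have to handle footprints that grow after a resize, which this sketch ignores.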
Event Type
Workshop
Time
Monday, 13 November 2023, 5pm - 5:30pm MST
Location
601
Tags
Applications
Data Movement and Memory
Heterogeneous Computing
I/O and File Systems
Large Scale Systems
Middleware and System Software
Performance Measurement, Modeling, and Tools
Performance Optimization
Registration Categories
W