Presentation

Fluxion: A Scalable Graph-Based Resource Model for HPC Scheduling Challenges
Description
The current era of exascale supercomputing and the emergence of a computing continuum present several significant resource management challenges. These include, but are not limited to, management of complex scientific workflows, diverse resources such as power, elasticity in user jobs, and converged environments. The resource models that underpin today's job scheduling frameworks reflect the node- (or core-) centric system architectures prevalent when the frameworks were designed. Consequently, they are not suited to capturing resource relationships or dynamism. This greatly limits their applicability to the emerging multifaceted challenges in high-performance computing (HPC) and other converged environments. To overcome these challenges, we propose a scalable graph-based resource model that can represent complex, changing resource relationships and multiple containment hierarchies. We implement this model, Fluxion, in a production-quality framework and evaluate its performance. Additionally, we present emerging and advanced scheduling use cases that our model enables.
Event Type
Workshop
Time
Monday, 13 November 2023, 11:57am - 12:15pm MST
Location
704-706
Tags
Data Analysis, Visualization, and Storage
Large Scale Systems
Programming Frameworks and System Software
Reproducibility
Resource Management
Runtime Systems
Registration Categories
W