Interference-Aware Multiplexing for Deep Learning in GPU Clusters: A Middleware Approach
Description
A common strategy for improving training efficiency in deep learning is to multiplex tasks on a single GPU. To mitigate the interference caused by multiplexing, existing approaches primarily employ kernel-level solutions that regulate GPU kernel execution, or hardware-level techniques that explicitly restrict GPU streaming multiprocessors and memory. Nevertheless, none of them performs satisfactorily in optimizing task completion time.

In this paper, we present IADeep, a middleware solution designed to significantly improve multiplexing efficiency. The core concept is the co-optimization of task assignments within a cluster and interference mitigation on each device. IADeep coordinates the configurations of all co-located tasks in a less fine-grained fashion, effectively reducing interference and enhancing task training performance. Across the entire cluster, IADeep intelligently selects applications suitable for multiplexing to further amplify the benefits of optimizing task configurations. Evaluations on a cluster of 20 RTX 3090 GPUs demonstrate that IADeep significantly outperforms state-of-the-art multiplexing solutions.
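The hardware-level restriction contrasted in the abstract can be illustrated with CUDA MPS, which caps the fraction of streaming multiprocessors each co-located process may occupy. A minimal sketch follows; the 50/50 split and the script names `train_job_a.py` / `train_job_b.py` are hypothetical examples, not IADeep's actual configuration:

```shell
# Illustrative config fragment: hardware-level SM partitioning via CUDA MPS.
# The 50/50 split and job scripts are assumed examples only.
nvidia-cuda-mps-control -d    # start the MPS control daemon

# Each co-located training job is limited to ~50% of the GPU's SMs.
CUDA_MPS_ACTIVE_THREAD_PERCENTAGE=50 python train_job_a.py &
CUDA_MPS_ACTIVE_THREAD_PERCENTAGE=50 python train_job_b.py &
```

Static caps like these reduce interference but cannot adapt to workload dynamics, which is the gap middleware approaches such as IADeep aim to address.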
Event Type
Paper
Time
Tuesday, 14 November 2023, 4:30pm - 5pm MST
Location
401-402
Tags
Accelerators
Distributed Computing
Middleware and System Software
Performance Measurement, Modeling, and Tools
Post-Moore Computing
Registration Categories
TP
Reproducibility Badges