Presentation

Fine-Grained Accelerator Partitioning for Machine Learning and Scientific Computing in Function as a Service Platform
Description
Function-as-a-service (FaaS) is a promising execution environment for high-performance computing (HPC) and machine learning (ML) applications, as it offers developers a simple way to write and deploy programs. Nowadays, GPUs and other accelerators are indispensable for HPC and ML workloads. However, we have observed that state-of-the-art FaaS frameworks usually treat accelerators as a single device to run a single workload and have little support for multiplexing accelerators.

In this work, we present techniques to multiplex GPUs with Parsl, a popular FaaS framework. With our enhancements, multiplexing a GPU yields up to 60% lower task completion time and a 250% improvement in the throughput of a large language model compared with running without multiplexing. We plan to extend the support for GPU multiplexing in FaaS platforms by tackling two challenges: changing the compute resources assigned to a partition, and estimating how to right-size a GPU partition for a function.
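
To make the idea of GPU multiplexing concrete, here is a minimal sketch of how several FaaS workers might be pinned to shares of the same GPU. It is an illustration under stated assumptions, not the authors' implementation or Parsl's API: the helper name `worker_environments` and the equal-share policy are hypothetical. The two environment variables are standard CUDA ones. `CUDA_VISIBLE_DEVICES` restricts a process to one device, and `CUDA_MPS_ACTIVE_THREAD_PERCENTAGE` caps the fraction of GPU threads a client may use, which only takes effect when an NVIDIA MPS control daemon is running.

```python
from typing import Dict, List


def worker_environments(num_gpus: int, workers_per_gpu: int) -> List[Dict[str, str]]:
    """Build one environment-variable dict per worker process.

    Hypothetical helper: each GPU is shared by `workers_per_gpu` workers,
    and every worker receives an equal share of that GPU's threads.
    """
    if num_gpus < 1 or workers_per_gpu < 1:
        raise ValueError("num_gpus and workers_per_gpu must be positive")

    # Equal split of the GPU; integer division rounds the share down.
    share = 100 // workers_per_gpu

    envs: List[Dict[str, str]] = []
    for gpu in range(num_gpus):
        for _ in range(workers_per_gpu):
            envs.append(
                {
                    # Pin the worker to a single physical GPU.
                    "CUDA_VISIBLE_DEVICES": str(gpu),
                    # Cap the worker's share of that GPU (requires MPS).
                    "CUDA_MPS_ACTIVE_THREAD_PERCENTAGE": str(share),
                }
            )
    return envs


if __name__ == "__main__":
    # Two GPUs, two workers on each: four workers in total.
    for index, env in enumerate(worker_environments(2, 2)):
        print(index, env)
```

A launcher would merge each dictionary into the environment of the corresponding worker process before it starts. Right-sizing, as discussed above, would replace the equal split with a per-function share.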
Event Type
Workshop
Time
Sunday, 12 November 2023, 11:42am - 12:06pm MST
Location
704-706
Tags
Middleware and System Software
Programming Frameworks and System Software
Runtime Systems
Registration Categories
W