Presentation

Kubeflow-as-a-Service on HPC clusters – First Experiences
Description
Development platforms specific to domain sciences have the potential to improve users' productivity on an HPC cluster by smoothing the steep learning curve of using one. These platforms also help abstract away certain practices that users must otherwise implement themselves to get optimal performance out of the allocated resources. These objectives require preparatory work, both on the systems side and at the application level. The presentation discusses the first experiences of prototyping with Kubeflow and deploying it as a service shared by multiple users. The deployment was designed with HPC clusters or multi-node cloud instances as the target computational resources. Kubeflow is an open-source platform, built on Kubernetes, that simplifies the deployment of ML/DL workloads. It offers a simple UI for interactive computing, workflow orchestration through Kubeflow Pipelines, and an intuitive interface for hyperparameter tuning experiments through Katib. These features are attractive for their ease of use when deploying software environments for model and workflow development for users in academic research settings, on both cloud and university HPC clusters.
Event Type
Workshop
Time
Monday, 13 November 2023
9:40am - 9:45am MST
Location
607
Registration Categories
W