Presentation

Graph Based Anomaly Detection in Chimbuko: Feasible or Fallible?
Description
Performance anomaly detection can aid in discovering algorithmic inefficiencies or hardware issues in an application’s environment. The Chimbuko framework monitors large-scale workflow applications in real-time and identifies function executions which deviate from accumulated statistics (performance anomalies). Performance anomalies across runs correlate with variation in execution times of an application; quicker resolution of performance anomalies caused by hardware issues improves cluster performance. Anomalous and normal executions are stored as events in Chimbuko. In this study, we investigate the applicability of graph-based deep learning methods for anomaly classification. We hypothesize that transforming data into a graph will allow correlations to be modeled, thus allowing graph-based methods to learn embeddings that can improve the effectiveness of downstream anomaly classification tasks. Our evaluations demonstrate that the graph-based methods yield up to 95% accuracy and outperform a state-of-the-art gradient-based method. Moreover, we provide an explanation of the classification model’s decision-making process through explainable AI techniques.
Event Type
Posters
Research Posters
Time
Tuesday, 14 November 2023, 10am - 5pm MST
Registration Categories
TP
XO/EX