{"title":"FluxInfer: Automatic Diagnosis of Performance Anomaly for Online Database System","authors":"Ping Liu, Shenglin Zhang, Yongqian Sun, Yuan Meng, Jiahai Yang, Dan Pei","doi":"10.1109/IPCCC50635.2020.9391550","DOIUrl":null,"url":null,"abstract":"The root cause diagnosis of performance anomaly for online database anomalies is challenging due to diverse types of database engines, different operational modes, and variable anomaly patterns. To relieve database operators from manual anomaly diagnosis and alarm storm, we propose FluxInfer, a framework to accurately and rapidly localize root cause related KPIs for database performance anomaly. It first constructs a Weighted Undirected Dependency Graph (WUDG) to represent the dependency relationships of anomalous KPIs accurately, and then applies a weighted PageRank algorithm to localize root cause related KPIs. The testbed evaluation experiments show that the AC@3, AC@5, and Avg@5 of FluxInfer are 0.90, 0.95, and 0.77, outperforming nine baselines by 64%, 60%, and 53% on average, respectively.","PeriodicalId":226034,"journal":{"name":"2020 IEEE 39th International Performance Computing and Communications Conference (IPCCC)","volume":"67 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2020-11-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"5","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"2020 IEEE 39th International Performance Computing and Communications Conference (IPCCC)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/IPCCC50635.2020.9391550","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 5
Abstract
The root cause diagnosis of performance anomaly for online database anomalies is challenging due to diverse types of database engines, different operational modes, and variable anomaly patterns. To relieve database operators from manual anomaly diagnosis and alarm storm, we propose FluxInfer, a framework to accurately and rapidly localize root cause related KPIs for database performance anomaly. It first constructs a Weighted Undirected Dependency Graph (WUDG) to represent the dependency relationships of anomalous KPIs accurately, and then applies a weighted PageRank algorithm to localize root cause related KPIs. The testbed evaluation experiments show that the AC@3, AC@5, and Avg@5 of FluxInfer are 0.90, 0.95, and 0.77, outperforming nine baselines by 64%, 60%, and 53% on average, respectively.