Yi Song, Xiaoyuan Xie, Xihao Zhang, Quanming Liu, Ruizhi Gao
{"title":"基于进化排序的故障接近度在故障隔离中的应用","authors":"Yi Song, Xiaoyuan Xie, Xihao Zhang, Quanming Liu, Ruizhi Gao","doi":"10.1145/3551349.3556922","DOIUrl":null,"url":null,"abstract":"Failures that are not related to a specific fault can reduce the effectiveness of fault localization in multi-fault scenarios. To tackle this challenge, researchers and practitioners typically cluster failures (e.g., failed test cases) into several disjoint groups, with those caused by the same fault grouped together. In such a fault isolation process that requires input in a mathematical form, ranking-based failure proximity (R-proximity) is widely used to model failed test cases. In R-proximity, each failed test case is represented as a suspiciousness ranking list of program statements through a fingerprinting function (i.e., a risk evaluation formula, REF). Although many off-the-shelf REFs have been integrated into R-proximity, they were designed for single-fault localization originally. To the best of our knowledge, no REF has been developed to serve as a fingerprinting function of R-proximity in multi-fault scenarios. For better clustering failures in fault isolation, in this paper, we present a genetic programming-based framework along with a sophisticated fitness function, for evolving REFs with the goal of more properly representing failures in multi-fault scenarios. By using a small set of programs for training, we get a collection of REFs that can obtain good results applicable in a larger and more general scale of scenarios. The best one of them outperforms the state-of-the-art by 50.72% and 47.41% in faults number estimation and clustering effectiveness, respectively. Our framework is highly configurable for further use, and the evolved formulas can be directly applied in future failure representation tasks without any retraining.","PeriodicalId":197939,"journal":{"name":"Proceedings of the 37th IEEE/ACM International Conference on Automated Software Engineering","volume":"18 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2022-10-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"1","resultStr":"{\"title\":\"Evolving Ranking-Based Failure Proximities for Better Clustering in Fault Isolation\",\"authors\":\"Yi Song, Xiaoyuan Xie, Xihao Zhang, Quanming Liu, Ruizhi Gao\",\"doi\":\"10.1145/3551349.3556922\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Failures that are not related to a specific fault can reduce the effectiveness of fault localization in multi-fault scenarios. To tackle this challenge, researchers and practitioners typically cluster failures (e.g., failed test cases) into several disjoint groups, with those caused by the same fault grouped together. In such a fault isolation process that requires input in a mathematical form, ranking-based failure proximity (R-proximity) is widely used to model failed test cases. In R-proximity, each failed test case is represented as a suspiciousness ranking list of program statements through a fingerprinting function (i.e., a risk evaluation formula, REF). Although many off-the-shelf REFs have been integrated into R-proximity, they were designed for single-fault localization originally. To the best of our knowledge, no REF has been developed to serve as a fingerprinting function of R-proximity in multi-fault scenarios. For better clustering failures in fault isolation, in this paper, we present a genetic programming-based framework along with a sophisticated fitness function, for evolving REFs with the goal of more properly representing failures in multi-fault scenarios. By using a small set of programs for training, we get a collection of REFs that can obtain good results applicable in a larger and more general scale of scenarios. The best one of them outperforms the state-of-the-art by 50.72% and 47.41% in faults number estimation and clustering effectiveness, respectively. Our framework is highly configurable for further use, and the evolved formulas can be directly applied in future failure representation tasks without any retraining.\",\"PeriodicalId\":197939,\"journal\":{\"name\":\"Proceedings of the 37th IEEE/ACM International Conference on Automated Software Engineering\",\"volume\":\"18 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2022-10-10\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"1\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Proceedings of the 37th IEEE/ACM International Conference on Automated Software Engineering\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1145/3551349.3556922\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Proceedings of the 37th IEEE/ACM International Conference on Automated Software Engineering","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1145/3551349.3556922","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Evolving Ranking-Based Failure Proximities for Better Clustering in Fault Isolation
Failures that are not related to a specific fault can reduce the effectiveness of fault localization in multi-fault scenarios. To tackle this challenge, researchers and practitioners typically cluster failures (e.g., failed test cases) into several disjoint groups, with those caused by the same fault grouped together. In such a fault isolation process that requires input in a mathematical form, ranking-based failure proximity (R-proximity) is widely used to model failed test cases. In R-proximity, each failed test case is represented as a suspiciousness ranking list of program statements through a fingerprinting function (i.e., a risk evaluation formula, REF). Although many off-the-shelf REFs have been integrated into R-proximity, they were designed for single-fault localization originally. To the best of our knowledge, no REF has been developed to serve as a fingerprinting function of R-proximity in multi-fault scenarios. For better clustering failures in fault isolation, in this paper, we present a genetic programming-based framework along with a sophisticated fitness function, for evolving REFs with the goal of more properly representing failures in multi-fault scenarios. By using a small set of programs for training, we get a collection of REFs that can obtain good results applicable in a larger and more general scale of scenarios. The best one of them outperforms the state-of-the-art by 50.72% and 47.41% in faults number estimation and clustering effectiveness, respectively. Our framework is highly configurable for further use, and the evolved formulas can be directly applied in future failure representation tasks without any retraining.