Hengyuan Liu, Zheng Li, Xiaolan Kang, Shumei Wu, Doyle Paul, Xiang Chen, Yong Liu
SCOPE: Hybrid optimization strategy for higher-order mutation-based fault localization
Information and Software Technology, Volume 188, Article 107873. Published 2025-08-20. DOI: 10.1016/j.infsof.2025.107873
https://www.sciencedirect.com/science/article/pii/S0950584925002125
Citations: 0
Abstract
Context:
Mutation-Based Fault Localization (MBFL) using Higher-Order Mutants (HOMs) has achieved promising performance on multiple-fault programs by simulating more realistic faults. Despite its effectiveness, it can be extremely costly because numerous HOMs must be executed. Existing cost-optimization strategies, however, mainly focus on first-order mutants (FOMs) and do not consider the dependency relationships between HOMs and multiple program entities.
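To make the MBFL setting concrete, the following is a minimal sketch of how mutant kill information can be turned into statement suspiciousness. It assumes the Ochiai formula, which is one commonly used MBFL formula (the abstract does not say which formulas are evaluated), and it assumes that a HOM's score is credited to every statement it mutates, with each statement keeping its maximum. The names `ochiai`, `statement_scores`, and the toy data are hypothetical.

```python
import math
from typing import Dict, FrozenSet, List


def ochiai(kill: List[bool], failing: List[bool]) -> float:
    """Ochiai score of one mutant.

    kill[i]    -- True if test i behaves differently on the mutant (kills it)
    failing[i] -- True if test i fails on the original program
    """
    kf = sum(1 for k, f in zip(kill, failing) if k and f)      # failing tests that kill
    kp = sum(1 for k, f in zip(kill, failing) if k and not f)  # passing tests that kill
    total_failing = sum(failing)
    denom = math.sqrt(total_failing * (kf + kp))
    return kf / denom if denom else 0.0


def statement_scores(kills: Dict[FrozenSet[int], List[bool]],
                     failing: List[bool]) -> Dict[int, float]:
    """Assumption: each statement takes the maximum score of the HOMs mutating it."""
    scores: Dict[int, float] = {}
    for hom, kill in kills.items():
        score = ochiai(kill, failing)
        for stmt in hom:
            scores[stmt] = max(scores.get(stmt, 0.0), score)
    return scores


# Toy data, invented for illustration: three tests, the first two fail.
failing = [True, True, False]
kills = {
    frozenset({1, 2}): [True, True, False],
    frozenset({1, 3}): [True, False, False],
    frozenset({3, 4}): [False, False, True],
}
scores = statement_scores(kills, failing)
```

Every entry in `kills` requires running the test suite against one HOM, which is where the cost described above comes from.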
Objective:
In this article, we propose a novel strategy called Smart Cost-Optimization through dynamic Prediction and sampling Execution (SCOPE). It aims to reduce costs while providing rich mutation analysis information.
Methods:
SCOPE contains two key components: a Smart HOM Sampler and a Mutant-Testing Predictor. The former pre-selects the most promising HOMs for each program entity to execute, based on their association with suspicious program entities. The latter employs machine learning to infer the impact of the remaining HOMs on tests using test execution data from selected HOMs, without the need for actual execution.
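The sample-then-predict idea can be sketched as follows. This is an illustration under stated assumptions, not the authors' implementation: the abstract does not say how the association with suspicious entities is measured, so the sketch ranks HOMs by a summed prior suspiciousness, and it replaces the machine-learning model with a plain 1-nearest-neighbour lookup to stay dependency-free. `Hom`, `sample_homs`, `predict_kills`, and all data values are hypothetical.

```python
import math
from typing import Dict, FrozenSet, List, Set

Hom = FrozenSet[int]  # hypothetical encoding: the statement ids a HOM mutates


def sample_homs(homs: List[Hom], prior: Dict[int, float], rate: float) -> Set[Hom]:
    """For each statement, keep the top `rate` share of the HOMs that mutate it,
    ranked by the summed prior suspiciousness of all statements they mutate."""
    selected: Set[Hom] = set()
    for stmt in prior:
        touching = [h for h in homs if stmt in h]
        # Highest summed prior first; sorted ids break ties deterministically.
        touching.sort(key=lambda h: (-sum(prior[s] for s in h), sorted(h)))
        selected.update(touching[: math.ceil(rate * len(touching))])
    return selected


def predict_kills(target: Hom, executed: Dict[Hom, List[bool]],
                  prior: Dict[int, float]) -> List[bool]:
    """Stand-in predictor: reuse the kill vector of the executed HOM that shares
    the most prior suspiciousness mass with `target` (1-nearest neighbour)."""
    nearest = max(executed, key=lambda h: sum(prior[s] for s in target & h))
    return list(executed[nearest])


# Toy data, invented for illustration.
prior = {1: 0.9, 2: 0.6, 3: 0.2, 4: 0.1}  # assumed to come from a cheap first pass
homs = [frozenset(p) for p in ((1, 2), (1, 3), (1, 4), (2, 3), (2, 4), (3, 4))]

selected = sample_homs(homs, prior, rate=0.3)  # HOMs that would actually be run
# Pretend these kill vectors (one flag per test) came from executing `selected`.
executed = {
    frozenset({1, 2}): [True, True, False],
    frozenset({1, 3}): [True, False, False],
    frozenset({1, 4}): [True, False, True],
}
# Kill vectors for the remaining HOMs are inferred, not measured.
predicted = {h: predict_kills(h, executed, prior) for h in homs if h not in selected}
print(len(selected), "executed,", len(predicted), "predicted")
```

In this toy run half of the HOMs are executed and the other half receive inferred kill vectors, so a downstream MBFL formula still sees information for all six.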
Results:
(1) SCOPE outperforms state-of-the-art optimization strategies, including SELECTIVE, SAMPLING, and PMT, regardless of the sampling rate or MBFL formula adopted. (2) SCOPE can reduce the number of involved HOMs by up to 90% without any loss in MBFL performance. (3) SCOPE outperforms baseline methods including SBFL, three optimized MBFL techniques (WSOME, SGS, HMBFL), and two deep learning-based fault localization techniques (CNNFL and RNNFL). (4) An ablation experiment confirms that the Smart HOM Sampler and the Mutant-Testing Predictor both contribute positively to the effectiveness of SCOPE, with average improvements of 23.60% and 15.14% in TOP-1 and A-EXAM. Additionally, a comparison of machine learning models for the Mutant-Testing Predictor shows that Random Forest achieves better prediction performance than Logistic Regression and Naive Bayes.
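For readers unfamiliar with the metrics, the sketch below shows how TOP-N and EXAM are commonly computed from a suspiciousness ranking. The abstract does not define A-EXAM, so treating it as the average EXAM over all faults is an assumption, as is the worst-case handling of ties. All function names and values are hypothetical.

```python
from typing import Dict, List


def rank_of(stmt: int, susp: Dict[int, float]) -> int:
    """Worst-case rank: every statement tied with `stmt` is counted ahead of it."""
    return sum(1 for score in susp.values() if score >= susp[stmt])


def top_n(faults: List[int], susp: Dict[int, float], n: int) -> int:
    """Number of faulty statements ranked within the first n positions."""
    return sum(1 for f in faults if rank_of(f, susp) <= n)


def exam(fault: int, susp: Dict[int, float]) -> float:
    """Share of statements inspected up to and including the fault."""
    return rank_of(fault, susp) / len(susp)


def a_exam(faults: List[int], susp: Dict[int, float]) -> float:
    """Assumption: A-EXAM is the mean EXAM over all faults of a program."""
    return sum(exam(f, susp) for f in faults) / len(faults)


# Toy ranking, invented for illustration: statement id -> suspiciousness.
susp = {10: 0.9, 11: 0.4, 12: 0.7, 13: 0.1, 14: 0.4}
faults = [10, 11]  # a two-fault program

print(top_n(faults, susp, 1), a_exam(faults, susp))
```

Higher TOP-N is better, while lower EXAM is better, since it means less code is inspected before a fault is reached.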
Conclusions:
Evaluation on 135 real-world multiple-fault programs from the widely used Defects4J benchmark shows the effectiveness of our proposed hybrid optimization strategy, SCOPE, for higher-order mutation-based fault localization.
About the journal:
Information and Software Technology is the international archival journal focusing on research and experience that contributes to the improvement of software development practices. The journal's scope includes methods and techniques to better engineer software and manage its development. Articles submitted for review should have a clear component of software engineering or address ways to improve the engineering and management of software development. Areas covered by the journal include:
• Software management, quality and metrics
• Software processes
• Software architecture, modelling, specification, design and programming
• Functional and non-functional software requirements
• Software testing and verification & validation
• Empirical studies of all aspects of engineering and managing software development
Short Communications is a new section dedicated to short papers addressing new ideas, controversial opinions, "Negative" results and much more. Read the Guide for authors for more information.
The journal encourages and welcomes submissions of systematic literature studies (reviews and maps) within the scope of the journal. Information and Software Technology is the premier outlet for systematic literature studies in software engineering.