SmartFL: Semantics Based Probabilistic Fault Localization

IF 5.6 | Zone 1, Computer Science | Q1, COMPUTER SCIENCE, SOFTWARE ENGINEERING

IEEE Transactions on Software Engineering Pub Date : 2025-03-27 DOI:10.1109/TSE.2025.3574487

Yiqian Wu;Yujie Liu;Yi Yin;Muhan Zeng;Zhentao Ye;Xin Zhang;Yingfei Xiong;Lu Zhang

IEEE Transactions on Software Engineering, vol. 51, no. 7, pp. 2161-2180. Journal Article. https://ieeexplore.ieee.org/document/11016188/

Citations: 0

Abstract

Testing-based fault localization has been a research focus in software engineering over the past decades. It localizes faulty program elements based on a set of passing and failing test executions. Since whether a fault can be triggered and detected by a test depends on program semantics, it is crucial to model program semantics in fault localization approaches. Existing approaches either consider the full semantics of the program (e.g., mutation-based fault localization and angelic debugging), leading to scalability issues, or ignore the semantics of the program (e.g., spectrum-based fault localization), leading to imprecise localization results. Our key idea is that by modeling only the correctness of program values, not their full semantics, a balance can be reached between effectiveness and scalability. To realize this idea, we introduce a probabilistic model based on an efficient approximation of program semantics, together with several techniques to address scalability challenges. Our approach, SmartFL (SeMantics bAsed pRobabilisTic Fault Localization), is evaluated on a real-world dataset, Defects4J 2.0. The top-1 statement-level accuracy of our approach is 14%, a 130% improvement over the best SBFL and MBFL methods. The average time cost is 205 seconds per fault, which is half that of SBFL methods. After combining our approach with existing approaches using the CombineFL framework, the performance of the combined approach is significantly boosted, by an average of 10% on top-1, top-3, and top-5 accuracy compared to state-of-the-art combination methods.
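To make the key idea more concrete, the following is a minimal sketch of what "modeling only the correctness of program values" could look like. It is not SmartFL's actual model, which the abstract does not specify. Everything here is an assumption made for illustration: the single-fault hypothesis, the two probabilities `P_MANIFEST` and `P_PROPAGATE`, the trace format, and all function names. Each executed statement produces one value that is either correct or incorrect, and the posterior over faulty statements is computed by exhaustive enumeration, which is only feasible for tiny traces and does not reproduce the paper's scalability techniques.

```python
from itertools import product

# Assumed parameters (hypothetical, not taken from the paper).
P_MANIFEST = 0.5   # chance that a faulty statement produces a wrong value
P_PROPAGATE = 0.9  # chance that a correct statement passes on a wrong input


def likelihood(trace, observed_ok, faulty):
    """P(test outcome | `faulty` is the only faulty statement).

    trace: list of (statement, input_values, output_value) in execution
    order, with each output value named uniquely. The last output is the
    value the test checks. Values not produced by the trace (test inputs)
    are treated as correct.
    """
    total = 0.0
    # Enumerate every correct/incorrect assignment to the produced values.
    for bits in product([True, False], repeat=len(trace)):
        prob = 1.0
        env = {}
        for (stmt, inputs, out), ok in zip(trace, bits):
            inputs_ok = all(env.get(v, True) for v in inputs)
            if stmt == faulty:
                p_wrong = P_MANIFEST      # simplification: ignores inputs
            elif inputs_ok:
                p_wrong = 0.0             # correct code, correct inputs
            else:
                p_wrong = P_PROPAGATE     # wrong input may be masked
            prob *= (1.0 - p_wrong) if ok else p_wrong
            env[out] = ok
        if env[trace[-1][2]] == observed_ok:
            total += prob
    return total


def posterior(statements, tests):
    """Posterior probability of each statement being the single fault."""
    weights = {}
    for s in statements:
        w = 1.0 / len(statements)  # uniform prior
        for trace, observed_ok in tests:
            w *= likelihood(trace, observed_ok, s)
        weights[s] = w
    z = sum(weights.values())
    if z == 0.0:
        raise ValueError("no single-fault hypothesis explains the tests")
    return {s: w / z for s, w in weights.items()}


# Toy program: s1 computes a; s2 computes b from a; s3 computes the result.
failing = ([("s1", [], "a"), ("s2", ["a"], "b"), ("s3", ["b"], "c")], False)
passing = ([("s1", [], "a"), ("s3", ["a"], "c")], True)
post = posterior(["s1", "s2", "s3"], [failing, passing])
for stmt, p in sorted(post.items(), key=lambda kv: -kv[1]):
    print(f"{stmt}: {p:.3f}")
```

In this toy setting, `s2` is ranked first because it is the only statement that explains the failing run without also having to explain away the passing run. Unlike a coverage-only score, the ranking depends on how values flow between statements, which is the kind of semantic information the abstract argues for.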


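Source journal

IEEE Transactions on Software Engineering (Engineering & Technology - Engineering: Electrical & Electronic)

CiteScore: 9.70

Self-citation rate: 10.80%

Articles published: 724

Review time: 6 months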

About the journal: IEEE Transactions on Software Engineering seeks contributions comprising well-defined theoretical results and empirical studies with potential impacts on software construction, analysis, or management. The scope of this Transactions extends from fundamental mechanisms to the development of principles and their application in specific environments. Specific topic areas include: a) Development and maintenance methods and models: Techniques and principles for specifying, designing, and implementing software systems, encompassing notations and process models. b) Assessment methods: Software tests, validation, reliability models, test and diagnosis procedures, software redundancy, design for error control, and measurements and evaluation of process and product aspects. c) Software project management: Productivity factors, cost models, schedule and organizational issues, and standards. d) Tools and environments: Specific tools, integrated tool environments, associated architectures, databases, and parallel and distributed processing issues. e) System issues: Hardware-software trade-offs. f) State-of-the-art surveys: Syntheses and comprehensive reviews of the historical development within specific areas of interest.