GBSR: Graph-based suspiciousness refinement for improving fault localization

IF 3.7; Zone 2, Computer Science; JCR Q1, Computer Science, Software Engineering

Journal of Systems and Software; Pub Date: 2024-08-26; DOI: 10.1016/j.jss.2024.112189

Zheng Li, Mingyu Li, Shumei Wu, Shunqing Xu, Xiang Chen, Yong Liu

Volume 218, Article 112189. Publication type: Journal Article. Full text: https://www.sciencedirect.com/science/article/pii/S0164121224002334

Citations: 0

Abstract

Fault localization (FL) is an important and time-consuming phase of software debugging. At its core, FL computes a suspiciousness score for each program entity (e.g., a statement) and produces a ranked list that guides developers during code inspection. A prevalent problem with existing FL techniques, however, is that program entities with similar execution information tend to receive similar suspiciousness scores. This can confuse developers and significantly reduce the effectiveness of debugging. To alleviate this issue, we introduce fine-grained contextual information (such as partial code structure, coverage, and features from mutation analysis) to enrich the characteristics of program entities. We propose graph structures to organize this information, in which passed and failed tests are modeled separately to account for their different impacts. To support the analysis of multidimensional features and the representation of large-scale programs, we adopt the PageRank algorithm to compute a weight for each program entity. Rather than altering the underlying FL process, we use these weights to refine the suspiciousness scores produced by various FL techniques, giving developers a more precise and actionable ranking of potential fault locations. The proposed strategy, Graph-Based Suspiciousness Refinement (GBSR), is evaluated on 243 real-world faulty programs from Defects4J. The results show that GBSR improves the accuracy of various FL techniques. Specifically, when refining traditional spectrum-based (SBFL) and mutation-based (MBFL) techniques, the number of faults localized at the first position of the ranking list (Top-1) increases by 189% and 68%, respectively. GBSR also boosts the state-of-the-art learning-based FL technique Grace, achieving a 2.8% improvement in Top-1.
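The idea of refining tied suspiciousness scores with graph-derived weights can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not specify how the graphs are built or how weights and scores are combined, so the node names (`t_fail`, `m1`, `s1` to `s3`), the edge list, the use of Ochiai as the base SBFL formula, and the multiplicative `refine` rule are all assumptions made for illustration. For brevity it builds only one graph from a single failing test, whereas the abstract describes treating passed and failed tests separately.

```python
import math

def ochiai(ef, ep, total_failed):
    """Ochiai SBFL score: ef / sqrt(total_failed * (ef + ep))."""
    denom = math.sqrt(total_failed * (ef + ep))
    return ef / denom if denom else 0.0

def pagerank(nodes, edges, damping=0.85, iters=100):
    """Plain power-iteration PageRank over a directed edge list."""
    out = {n: [] for n in nodes}
    for src, dst in edges:
        out[src].append(dst)
    n_nodes = len(nodes)
    rank = {n: 1.0 / n_nodes for n in nodes}
    for _ in range(iters):
        # Rank held by nodes without out-edges is spread uniformly.
        dangling = sum(rank[n] for n in nodes if not out[n])
        new = {n: (1 - damping + damping * dangling) / n_nodes for n in nodes}
        for n in nodes:
            for dst in out[n]:
                new[dst] += damping * rank[n] / len(out[n])
        rank = new
    return rank

def refine(susp, weights):
    """Hypothetical refinement rule: scale each score by its relative weight."""
    top = max(weights[s] for s in susp)
    return {s: susp[s] * (1 + weights[s] / top) for s in susp}

# Toy coverage spectrum: (ef, ep) = failing / passing tests covering the statement.
# s1 and s2 have identical execution information, so their base scores tie.
spectrum = {"s1": (1, 1), "s2": (1, 1), "s3": (0, 1)}
susp = {s: ochiai(ef, ep, total_failed=1) for s, (ef, ep) in spectrum.items()}

# Assumed graph: the failing test covers s1 and s2 and kills mutant m1,
# and m1 was generated from s2 (a mutation-analysis feature).
nodes = ["t_fail", "m1", "s1", "s2", "s3"]
edges = [("t_fail", "s1"), ("t_fail", "s2"), ("t_fail", "m1"), ("m1", "s2")]
weights = pagerank(nodes, edges)

refined = refine(susp, weights)
ranking = sorted(refined, key=refined.get, reverse=True)
print(ranking)  # s2 now ranks ahead of s1
```

In this toy setup the extra mutation edge gives `s2` a larger PageRank weight than `s1`, which breaks the tie left by the coverage-only score while leaving the base FL formula untouched.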


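Source journal

Journal of Systems and Software (Engineering and Technology; Computer Science: Theory and Methods)

CiteScore

8.60

Self-citation rate

5.70%

Articles published

193

Review time

16 weeks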

About the journal: The Journal of Systems and Software publishes papers covering all aspects of software engineering and related hardware-software-systems issues. All articles should include a validation of the idea presented, e.g. through case studies, experiments, or systematic comparisons with other approaches already in practice. Topics of interest include, but are not limited to:
• Methods and tools for, and empirical studies on, software requirements, design, architecture, verification and validation, maintenance and evolution
• Agile, model-driven, service-oriented, open source and global software development
• Approaches for mobile, multiprocessing, real-time, distributed, cloud-based, dependable and virtualized systems
• Human factors and management concerns of software development
• Data management and big data issues of software systems
• Metrics and evaluation, data mining of software development resources
• Business and economic aspects of software development processes
The journal welcomes state-of-the-art surveys and reports of practical experience for all of these topics.