Evaluating SZZ Implementations: An Empirical Study on the Linux Kernel

IF 6.5 | Zone 1, Computer Science | Q1 COMPUTER SCIENCE, SOFTWARE ENGINEERING

IEEE Transactions on Software Engineering Pub Date : 2024-03-29 DOI:10.1109/TSE.2024.3406718

Yunbo Lyu;Hong Jin Kang;Ratnadira Widyasari;Julia Lawall;David Lo


Citations: 0

Abstract


Evaluating SZZ Implementations: An Empirical Study on the Linux Kernel

The SZZ algorithm is used to connect bug-fixing commits to the earlier commits that introduced bugs. This algorithm has many applications and many variants have been devised. However, there are some types of commits that cannot be traced by the SZZ algorithm, referred to as “ghost commits”. The evaluation of how these ghost commits impact the SZZ implementations remains limited. Moreover, these implementations have been evaluated on datasets created by software engineering researchers from information in bug trackers and version controlled histories. Since Oct 2013, the Linux kernel developers have started labelling bug-fixing patches with the commit identifiers of the corresponding bug-inducing commit(s) as a standard practice. As of v6.1-rc5, 76,046 pairs of bug-fixing patches and bug-inducing commits are available. This provides a unique opportunity to evaluate the SZZ algorithm on a large dataset that has been created and reviewed by project developers, entirely independently of the biases of software engineering researchers. In this paper, we apply six SZZ implementations to 76,046 pairs of bug-fixing patches and bug-introducing commits from the Linux kernel. Our findings reveal that SZZ algorithms experience a more significant decline in recall on our dataset ($\downarrow 13.8\%$) as compared to prior findings reported by Rosa et al., and the disparities between the individual SZZ algorithms diminish. Moreover, we find that 17.47% of bug-fixing commits are ghost commits. Finally, we propose Tracing-Commit SZZ (TC-SZZ), that traces all commits in the change history of lines modified or deleted in bug-fixing commits. Applying TC-SZZ to all failure cases, excluding ghost commits, we found that TC-SZZ could identify 17.7% of them. Our further analysis based on git log found that 34.6% of bug-inducing commits were in the function history, 27.5% in the file history (but not in the function history), and 37.9% not in the file history. We further evaluated the effectiveness of ChatGPT in boosting the SZZ algorithm's ability to identify bug-inducing commits in the function history, in the file history and not in the file history.
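To make the developer-labelled ground truth concrete, here is a minimal Python sketch of how such pairs might be read and scored. It assumes the labels take the form of the kernel's `Fixes: <hash> ("subject")` trailer in the commit message, and it uses one plausible pair-level definition of recall; the paper's exact extraction and metric may differ. The function names (`extract_fixes`, `same_commit`, `recall`), the sample commit message, and all hashes are hypothetical and only for illustration.

```python
import re

# Assumption: the bug-inducing commit is named in a "Fixes:" trailer,
# e.g.  Fixes: 1a2b3c4d5e6f ("net: add foo")
FIXES_RE = re.compile(r"^Fixes:\s*([0-9a-f]{8,40})\b", re.IGNORECASE | re.MULTILINE)


def extract_fixes(message: str) -> list[str]:
    """Return the (possibly abbreviated) hashes named in Fixes: trailers."""
    return [sha.lower() for sha in FIXES_RE.findall(message)]


def same_commit(a: str, b: str) -> bool:
    """Treat two hashes as equal if one is a prefix of the other,
    since Fixes: trailers usually carry abbreviated hashes."""
    return a.startswith(b) or b.startswith(a)


def recall(ground_truth: dict[str, list[str]],
           predicted: dict[str, list[str]]) -> float:
    """Fraction of labelled (fix, bug-inducing) pairs that the tool recovered.

    Both arguments map a bug-fixing commit hash to a list of
    bug-inducing commit hashes.
    """
    total = hits = 0
    for fix, inducing in ground_truth.items():
        for sha in inducing:
            total += 1
            if any(same_commit(sha, p) for p in predicted.get(fix, [])):
                hits += 1
    return hits / total if total else 0.0


# Hypothetical commit message and hashes, not taken from the kernel.
message = (
    "net: fix leak in foo\n\n"
    'Fixes: 1a2b3c4d5e6f ("net: add foo")\n'
    "Signed-off-by: A Developer <dev@example.org>\n"
)
truth = {"f1": extract_fixes(message), "f2": ["aaaaaaaaaaaa"]}
szz_output = {"f1": ["1a2b3c4d5e6f7890"], "f2": ["bbbbbbbbbbbb"]}
score = recall(truth, szz_output)
```

In this toy input the tool recovers one of the two labelled pairs. Matching on hash prefixes is a simplification: a real pipeline would more likely resolve each abbreviated hash against the repository before comparing.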


Source journal

IEEE Transactions on Software Engineering | Engineering & Technology - Engineering: Electrical & Electronic

CiteScore

9.70

Self-citation rate

10.80%

Articles published

724

Review time

6 months

About the journal: IEEE Transactions on Software Engineering seeks contributions comprising well-defined theoretical results and empirical studies with potential impacts on software construction, analysis, or management. The scope of this Transactions extends from fundamental mechanisms to the development of principles and their application in specific environments. Specific topic areas include: a) Development and maintenance methods and models: Techniques and principles for specifying, designing, and implementing software systems, encompassing notations and process models. b) Assessment methods: Software tests, validation, reliability models, test and diagnosis procedures, software redundancy, design for error control, and measurements and evaluation of process and product aspects. c) Software project management: Productivity factors, cost models, schedule and organizational issues, and standards. d) Tools and environments: Specific tools, integrated tool environments, associated architectures, databases, and parallel and distributed processing issues. e) System issues: Hardware-software trade-offs. f) State-of-the-art surveys: Syntheses and comprehensive reviews of the historical development within specific areas of interest.