Extracting Concise Bug-Fixing Patches from Human-Written Patches in Version Control Systems

Yanjie Jiang, Hui Liu, Nan Niu, Lu Zhang, Yamin Hu
Published in: 2021 IEEE/ACM 43rd International Conference on Software Engineering (ICSE)
Publication date: 2021-02-27
DOI: 10.1109/ICSE43902.2021.00069 (https://doi.org/10.1109/ICSE43902.2021.00069)
Citations: 21

Abstract

High-quality, large-scale repositories of real bugs and their concise patches, collected from real-world applications, are critical for research in the software engineering community. In such a repository, each real bug is explicitly associated with its fix. On the one hand, real bugs and their fixes may inspire novel approaches for finding, locating, and repairing software bugs; on the other hand, they are indispensable for rigorous and meaningful evaluation of approaches to software testing, fault localization, and program repair. To this end, a number of such repositories, e.g., Defects4J, have been proposed. However, these repositories are rather small because their construction involves expensive human intervention. Although bug-fixing commits and their associated test cases can be retrieved from version control systems automatically, existing approaches cannot yet automatically extract concise bug-fixing patches from bug-fixing commits, because such commits often include bug-irrelevant changes. In this paper, we propose an automatic approach, called BugBuilder, to extract complete and concise bug-fixing patches from human-written patches in version control systems. It excludes refactorings by detecting the refactorings involved in a bug-fixing commit and reapplying them to the faulty version. It then enumerates all subsets of the remaining changes and validates them against test cases. If none of the subsets has the potential to be a complete bug-fixing patch, the remaining changes as a whole are taken as a complete and concise bug-fixing patch. Evaluation results on 809 real bug-fixing commits in Defects4J suggest that BugBuilder successfully generated complete and concise bug-fixing patches for 40% of the bug-fixing commits, and that its precision (99%) was even higher than that of human experts.
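
The subset-enumeration step described in the abstract can be illustrated with a minimal Python sketch. This is not BugBuilder's implementation: the names `proper_subsets`, `extract_concise_patch`, and `passes_tests` are hypothetical, changes are modeled as plain strings, and refactorings are assumed to have already been detected and reapplied to the faulty version. The abstract does not say what happens when a smaller subset already passes the tests, so the sketch simply returns `None` in that case; that behavior is an assumption.

```python
from itertools import combinations
from typing import Callable, FrozenSet, Iterable, Iterator, Optional

Change = str  # hypothetical identifier for one non-refactoring change


def proper_subsets(changes: FrozenSet[Change]) -> Iterator[FrozenSet[Change]]:
    """Yield every non-empty proper subset of the given changes."""
    items = sorted(changes)
    for size in range(1, len(items)):
        for combo in combinations(items, size):
            yield frozenset(combo)


def extract_concise_patch(
    remaining: Iterable[Change],
    passes_tests: Callable[[FrozenSet[Change]], bool],
) -> Optional[FrozenSet[Change]]:
    """Return the remaining changes as the patch if no smaller subset passes.

    `passes_tests` is a hypothetical oracle: it applies a subset of changes
    to the faulty version and reports whether the associated tests pass.
    """
    remaining = frozenset(remaining)
    candidates = [s for s in proper_subsets(remaining) if passes_tests(s)]
    if candidates:
        # Assumption: the abstract does not cover this case, so give up here.
        return None
    return remaining


# Toy oracle: the tests pass only when both fixing edits are applied.
def toy_oracle(subset: FrozenSet[Change]) -> bool:
    return {"fix_null_check", "fix_bound"} <= subset


concise = extract_concise_patch({"fix_null_check", "fix_bound"}, toy_oracle)
with_noise = extract_concise_patch(
    {"fix_null_check", "fix_bound", "add_logging"}, toy_oracle
)
```

In the toy example, `concise` is the full two-change set because neither single change passes alone, while `with_noise` is `None` because a proper subset already satisfies the tests. Note that the enumeration is exponential in the number of remaining changes, so a sketch like this is only practical for small commits.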