A method to identify overfitting program repair patches based on expression tree

IF 1.5; Zone 4 (Computer Science); JCR Q3 (COMPUTER SCIENCE, SOFTWARE ENGINEERING)
Yukun Dong, Xiaotong Cheng, Yufei Yang, Lulu Zhang, Shuqi Wang, Lingjie Kong
DOI: 10.1016/j.scico.2024.103105
Journal: Science of Computer Programming
Publication date: 2024-03-05
Publication type: Journal Article
Open access: No
URL: https://www.sciencedirect.com/science/article/pii/S0167642324000285
Citations: 0

Abstract

The primary aim of Automatic Program Repair (APR) is to repair defective programs automatically and thereby reduce the effort required of developers. However, APR techniques may produce overfitting patches, which let the program pass all test cases without truly repairing it. This paper provides a comprehensive review of the overfitting problem and extends existing research on overfitting in conditional statements. Our proposed method, ETPAT (Expression Tree-based Patch Assessment Technique), uses expression trees and targeted coverage criteria to identify differences between the original and the patched program. We use ETPAT to verify test case adequacy. ETPAT also guides the generation of corresponding test cases via equivalence class information; these test cases can be added to the original test suite, making it more robust and preventing the repair technique from generating comparable overfitting patches. On the patch set of the BuggyJavaJML benchmark, ETPAT recognized 77 of 82 (93.9%) overfitting patches among 120 patches related to conditional constraints, with higher accuracy and fewer required test cases than the original repair tool.
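
The general idea of comparing an original and a patched condition through their expression trees can be illustrated with a small sketch. This is not ETPAT itself: the abstract does not give its algorithm or coverage criteria, and the function names (`boundaries`, `representatives`, `differing_inputs`), the use of Python's `ast` module, and the "constant minus one, constant, constant plus one" choice of equivalence-class representatives are all assumptions made for illustration. The sketch handles only integer variables compared against non-negative integer literals.

```python
import ast
import itertools


def boundaries(cond: str) -> dict[str, set[int]]:
    """Walk the expression tree of `cond` and collect, per variable,
    the integer literals that appear in the same comparison."""
    found: dict[str, set[int]] = {}
    for node in ast.walk(ast.parse(cond, mode="eval")):
        if isinstance(node, ast.Name):
            # Register every variable, even one never compared to a literal.
            found.setdefault(node.id, set())
        elif isinstance(node, ast.Compare):
            operands = [node.left, *node.comparators]
            names = [n.id for n in operands if isinstance(n, ast.Name)]
            consts = [c.value for c in operands
                      if isinstance(c, ast.Constant) and isinstance(c.value, int)]
            for name in names:
                found.setdefault(name, set()).update(consts)
    return found


def representatives(original: str, patched: str) -> dict[str, list[int]]:
    """Assumed heuristic: around every boundary constant of either condition,
    take one value below, the boundary itself, and one value above."""
    merged: dict[str, set[int]] = {}
    for cond in (original, patched):
        for name, consts in boundaries(cond).items():
            merged.setdefault(name, set()).update(consts)
    return {name: sorted({v + d for v in (consts or {0}) for d in (-1, 0, 1)})
            for name, consts in merged.items()}


def differing_inputs(original: str, patched: str) -> list[dict[str, int]]:
    """Return the representative inputs on which the two conditions disagree."""
    reps = representatives(original, patched)
    names = sorted(reps)
    orig_code = compile(original, "<original>", "eval")
    patch_code = compile(patched, "<patched>", "eval")
    diffs = []
    for values in itertools.product(*(reps[n] for n in names)):
        env = dict(zip(names, values))
        # Evaluate both conditions with no builtins and only the chosen inputs.
        before = bool(eval(orig_code, {"__builtins__": {}}, env))
        after = bool(eval(patch_code, {"__builtins__": {}}, env))
        if before != after:
            diffs.append(env)
    return diffs


if __name__ == "__main__":
    # Hypothetical conditions, not taken from the benchmark.
    print(differing_inputs("x > 0", "x > 1"))
```

Each input returned is one where the patch changes the outcome of the condition, so it is a candidate for a new test case. Deciding which outcome is correct still requires an oracle, which this sketch does not provide.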

Source journal: Science of Computer Programming (Engineering & Technology - Computer Science: Software Engineering)
CiteScore: 3.80
Self-citation rate: 0.00%
Articles published: 76
Review time: 67 days
About the journal: Science of Computer Programming is dedicated to the distribution of research results in the areas of software systems development, use and maintenance, including the software aspects of hardware design. The journal has a wide scope ranging from the many facets of methodological foundations to the details of technical issues and the aspects of industrial practice. The subjects of interest to SCP cover the entire spectrum of methods for the entire life cycle of software systems, including:
• Requirements, specification, design, validation, verification, coding, testing, maintenance, metrics and renovation of software;
• Design, implementation and evaluation of programming languages;
• Programming environments, development tools, visualisation and animation;
• Management of the development process;
• Human factors in software, software for social interaction, software for social computing;
• Cyber physical systems, and software for the interaction between the physical and the machine;
• Software aspects of infrastructure services, system administration, and network management.