Some Automatically Generated Patches are More Likely to be Correct than Others: An Analysis of Defects4J Patch Features

2022 IEEE/ACM International Workshop on Automated Program Repair (APR). Pub Date: 2022-05-01. DOI: 10.1145/3524459.3527348

G. Bennett, T. Hall, David Bowes


Citations: 1

Abstract

Defects4J is a popular dataset against which many Java Automatic Program Repair (APR) tools benchmark their performance. However, recent evidence suggests that some APR tools overfit to Defects4J, producing plausible patches that are incorrect. What we do not currently know is whether the plausible patches that turn out to be incorrect have any features in common. We compare the features of Defects4J's human-written patches for defects that existing APR tools patched correctly against those for defects that were patched incorrectly. We found that 48.4% of Defects4J v1.5 defects have been automatically patched by existing APR tools: 28.9% have been correctly patched, leaving 19.5% incorrectly patched. We found that defects whose patches added a method call, added a variable, or wrapped existing code with new code (such as a try/catch block) were significantly associated with incorrect patches. Editing only a single line was significantly associated with correct patches. Our results suggest that current tools are weak at generating multi-line patches and at synthesising new code, especially when wrapping existing code. Our results highlight potential future areas of development for new APR approaches, such as developing a tool that effectively repairs defects that require a try/catch block. Our replication package is available online at: https://github.com/IncorrectDefects/ReplicationPackage
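The three percentages in the abstract are consistent with a single breakdown of all Defects4J v1.5 defects, since 28.9% and 19.5% sum to 48.4%. The following is a minimal Python sketch of that arithmetic, assuming all three figures share the same denominator (an interpretation of the abstract, not something it states explicitly). The constant and function names are illustrative only and do not come from the paper or its replication package.

```python
# Figures reported in the abstract, read as percentages of ALL
# Defects4J v1.5 defects (assumption: same denominator for all three).
PATCHED_PCT = 48.4    # defects with any plausible APR patch
CORRECT_PCT = 28.9    # defects with a correct APR patch
INCORRECT_PCT = 19.5  # defects with only incorrect APR patches


def share_of_patched(part_pct, patched_pct=PATCHED_PCT):
    """Express part_pct as a percentage of the patched subset (1 decimal)."""
    return round(100.0 * part_pct / patched_pct, 1)


def unpatched_pct(patched_pct=PATCHED_PCT):
    """Percentage of defects with no plausible APR patch at all (1 decimal)."""
    return round(100.0 - patched_pct, 1)


# Sanity check: the correct and incorrect shares add up to the patched share.
assert abs((CORRECT_PCT + INCORRECT_PCT) - PATCHED_PCT) < 1e-9

if __name__ == "__main__":
    print("correct, as share of patched defects:  ", share_of_patched(CORRECT_PCT))
    print("incorrect, as share of patched defects:", share_of_patched(INCORRECT_PCT))
    print("defects with no plausible patch:       ", unpatched_pct())
```

Under this reading, roughly six in ten patched defects received a correct patch and about four in ten received only incorrect ones, while just over half of the dataset has no plausible automatic patch.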

