对自动程序修复的系统研究:修复105个错误中的55个,每个8美元

Claire Le Goues, Michael Dewey-Vogt, S. Forrest, Westley Weimer
{"title":"对自动程序修复的系统研究:修复105个错误中的55个,每个8美元","authors":"Claire Le Goues, Michael Dewey-Vogt, S. Forrest, Westley Weimer","doi":"10.1109/ICSE.2012.6227211","DOIUrl":null,"url":null,"abstract":"There are more bugs in real-world programs than human programmers can realistically address. This paper evaluates two research questions: “What fraction of bugs can be repaired automatically?” and “How much does it cost to repair a bug automatically?” In previous work, we presented GenProg, which uses genetic programming to repair defects in off-the-shelf C programs. To answer these questions, we: (1) propose novel algorithmic improvements to GenProg that allow it to scale to large programs and find repairs 68% more often, (2) exploit GenProg's inherent parallelism using cloud computing resources to provide grounded, human-competitive cost measurements, and (3) generate a large, indicative benchmark set to use for systematic evaluations. We evaluate GenProg on 105 defects from 8 open-source programs totaling 5.1 million lines of code and involving 10,193 test cases. GenProg automatically repairs 55 of those 105 defects. To our knowledge, this evaluation is the largest available of its kind, and is often two orders of magnitude larger than previous work in terms of code or test suite size or defect count. Public cloud computing prices allow our 105 runs to be reproduced for $403; a successful repair completes in 96 minutes and costs $7.32, on average.","PeriodicalId":420187,"journal":{"name":"2012 34th International Conference on Software Engineering (ICSE)","volume":"31 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2012-06-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"587","resultStr":"{\"title\":\"A systematic study of automated program repair: Fixing 55 out of 105 bugs for $8 each\",\"authors\":\"Claire Le Goues, Michael Dewey-Vogt, S. Forrest, Westley Weimer\",\"doi\":\"10.1109/ICSE.2012.6227211\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"There are more bugs in real-world programs than human programmers can realistically address. This paper evaluates two research questions: “What fraction of bugs can be repaired automatically?” and “How much does it cost to repair a bug automatically?” In previous work, we presented GenProg, which uses genetic programming to repair defects in off-the-shelf C programs. To answer these questions, we: (1) propose novel algorithmic improvements to GenProg that allow it to scale to large programs and find repairs 68% more often, (2) exploit GenProg's inherent parallelism using cloud computing resources to provide grounded, human-competitive cost measurements, and (3) generate a large, indicative benchmark set to use for systematic evaluations. We evaluate GenProg on 105 defects from 8 open-source programs totaling 5.1 million lines of code and involving 10,193 test cases. GenProg automatically repairs 55 of those 105 defects. To our knowledge, this evaluation is the largest available of its kind, and is often two orders of magnitude larger than previous work in terms of code or test suite size or defect count. Public cloud computing prices allow our 105 runs to be reproduced for $403; a successful repair completes in 96 minutes and costs $7.32, on average.\",\"PeriodicalId\":420187,\"journal\":{\"name\":\"2012 34th International Conference on Software Engineering (ICSE)\",\"volume\":\"31 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2012-06-02\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"587\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"2012 34th International Conference on Software Engineering (ICSE)\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1109/ICSE.2012.6227211\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"2012 34th International Conference on Software Engineering (ICSE)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/ICSE.2012.6227211","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 587

摘要

在现实世界的程序中,bug比人类程序员实际能够解决的要多。本文评估了两个研究问题:“有多少bug可以自动修复?”和“自动修复一个bug要花多少钱?”在之前的工作中,我们介绍了GenProg,它使用遗传编程来修复现成的C程序中的缺陷。为了回答这些问题,我们:(1)对GenProg提出了新的算法改进,使其能够扩展到大型项目,并将修复频率提高68%;(2)利用GenProg固有的并行性,利用云计算资源提供可靠的、与人类竞争的成本测量;(3)生成一个大型的、指示性的基准集,用于系统评估。我们对GenProg的105个缺陷进行了评估,这些缺陷来自8个开源程序,总计510万行代码,涉及10,193个测试用例。GenProg自动修复这105个缺陷中的55个。据我们所知,此评估是同类中最大的可用评估,并且在代码或测试套件大小或缺陷数量方面通常比以前的工作大两个数量级。公共云计算的价格允许我们以403美元的价格复制105次运行;一次成功的修复只需96分钟,平均花费7.32美元。
本文章由计算机程序翻译,如有差异,请以英文原文为准。
A systematic study of automated program repair: Fixing 55 out of 105 bugs for $8 each
There are more bugs in real-world programs than human programmers can realistically address. This paper evaluates two research questions: “What fraction of bugs can be repaired automatically?” and “How much does it cost to repair a bug automatically?” In previous work, we presented GenProg, which uses genetic programming to repair defects in off-the-shelf C programs. To answer these questions, we: (1) propose novel algorithmic improvements to GenProg that allow it to scale to large programs and find repairs 68% more often, (2) exploit GenProg's inherent parallelism using cloud computing resources to provide grounded, human-competitive cost measurements, and (3) generate a large, indicative benchmark set to use for systematic evaluations. We evaluate GenProg on 105 defects from 8 open-source programs totaling 5.1 million lines of code and involving 10,193 test cases. GenProg automatically repairs 55 of those 105 defects. To our knowledge, this evaluation is the largest available of its kind, and is often two orders of magnitude larger than previous work in terms of code or test suite size or defect count. Public cloud computing prices allow our 105 runs to be reproduced for $403; a successful repair completes in 96 minutes and costs $7.32, on average.
求助全文
通过发布文献求助,成功后即可免费获取论文全文。 去求助
来源期刊
自引率
0.00%
发文量
0
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
确定
请完成安全验证×
copy
已复制链接
快去分享给好友吧!
我知道了
右上角分享
点击右上角分享
0
联系我们:info@booksci.cn Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。 Copyright © 2023 布克学术 All rights reserved.
京ICP备2023020795号-1
ghs 京公网安备 11010802042870号
Book学术文献互助
Book学术文献互助群
群 号:604180095
Book学术官方微信