Shipwright: A Human-in-the-Loop System for Dockerfile Repair

Jordan Henkel, Denini Silva, Leopoldo Teixeira, Marcelo d’Amorim, T. Reps
Published in: 2021 IEEE/ACM 43rd International Conference on Software Engineering (ICSE)
Publication date: 2021-02-22
DOI: 10.1109/ICSE43902.2021.00106 (https://doi.org/10.1109/ICSE43902.2021.00106)
Citations: 13

Abstract

Docker is a tool for lightweight OS-level virtualization. Docker images are created by performing a build, controlled by a source-level artifact called a Dockerfile. We studied Dockerfiles on GitHub and, to our great surprise, found that over a quarter of the examined Dockerfiles failed to build (and thus to produce images). To address this problem, we propose SHIPWRIGHT, a human-in-the-loop system for finding repairs to broken Dockerfiles. SHIPWRIGHT uses a modified version of the BERT language model to embed build logs and to cluster broken Dockerfiles. Using these clusters and a search-based procedure, we were able to design 13 rules for making automated repairs to Dockerfiles. With the aid of SHIPWRIGHT, we submitted 45 pull requests (with a 42.2% acceptance rate) to GitHub projects with broken Dockerfiles. Furthermore, in a "time-travel" analysis of broken Dockerfiles that were later fixed, we found that SHIPWRIGHT proposed repairs that were equivalent to human-authored patches in 22.77% of the cases we studied. Finally, we compared our work with recent, state-of-the-art, static Dockerfile analyses, and found that, while static tools detected possible build-failure-inducing issues in 20.6–33.8% of the files we examined, SHIPWRIGHT was able to detect possible issues in 73.25% of the files and, additionally, provide automated repairs for 18.9% of the files.
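The pipeline the abstract describes (embed build logs, cluster similar failures, then apply a repair rule to the cluster) can be illustrated with a minimal Python sketch. This is not SHIPWRIGHT's implementation: a bag-of-words vector stands in for the modified BERT embedding, the greedy threshold clustering is an assumed simplification, and `repair_missing_apt_update` is a hypothetical rule invented for illustration, not one of the paper's 13 rules.

```python
import math
import re
from collections import Counter
from typing import List, Optional


def embed(log: str) -> Counter:
    """Bag-of-words vector for a build log (stand-in for a BERT embedding)."""
    return Counter(re.findall(r"[a-z]+", log.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse token-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def cluster(logs: List[str], threshold: float = 0.5) -> List[List[int]]:
    """Greedy clustering: each log joins the first cluster whose first
    member is at least `threshold` similar; otherwise it starts a new one."""
    vectors = [embed(log) for log in logs]
    clusters: List[List[int]] = []
    for i, vec in enumerate(vectors):
        for members in clusters:
            if cosine(vec, vectors[members[0]]) >= threshold:
                members.append(i)
                break
        else:
            clusters.append([i])
    return clusters


def repair_missing_apt_update(dockerfile: str, log: str) -> Optional[str]:
    """Hypothetical rule: if apt could not find a package and the Dockerfile
    never refreshes the package index, prepend `apt-get update`."""
    if "Unable to locate package" not in log:
        return None  # rule does not match this failure
    if "apt-get update" in dockerfile:
        return None  # index is already refreshed; a different cause
    return dockerfile.replace(
        "RUN apt-get install", "RUN apt-get update && apt-get install", 1
    )


logs = [
    "E: Unable to locate package curl",
    "E: Unable to locate package git",
    "npm ERR! code ENOENT missing package.json",
]
groups = cluster(logs)
broken = "FROM ubuntu:20.04\nRUN apt-get install -y curl\n"
fixed = repair_missing_apt_update(broken, logs[0])
```

In this sketch the two apt failures land in one cluster and the npm failure in another, so a human reviewing the first cluster would only need to write one rule for both logs. That is the human-in-the-loop idea in miniature.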