Jordan Henkel, Denini Silva, Leopoldo Teixeira, Marcelo d’Amorim, T. Reps
{"title":"船匠:用于码头文件修复的人在循环系统","authors":"Jordan Henkel, Denini Silva, Leopoldo Teixeira, Marcelo d’Amorim, T. Reps","doi":"10.1109/ICSE43902.2021.00106","DOIUrl":null,"url":null,"abstract":"Docker is a tool for lightweight OS-level virtualization. Docker images are created by performing a build, controlled by a source-level artifact called a Dockerfile. We studied Dockerfiles on GitHub, and-to our great surprise-found that over a quarter of the examined Dockerfiles failed to build (and thus to produce images). To address this problem, we propose SHIPWRIGHT, a human-in-the-loop system for finding repairs to broken Dockerfiles. SHIPWRIGHT uses a modified version of the BERT language model to embed build logs and to cluster broken Dockerfiles. Using these clusters and a search-based procedure, we were able to design 13 rules for making automated repairs to Dockerfiles. With the aid of SHIPWRIGHT, we submitted 45 pull requests (with a 42.2% acceptance rate) to GitHub projects with broken Dockerfiles. Furthermore, in a \"time-travel\" analysis of broken Dockerfiles that were later fixed, we found that SHIPWRIGHT proposed repairs that were equivalent to human-authored patches in 22.77% of the cases we studied. Finally, we compared our work with recent, state-of-the-art, static Dockerfile analyses, and found that, while static tools detected possible build-failure-inducing issues in 20.6–33.8% of the files we examined, SHIPWRIGHT was able to detect possible issues in 73.25% of the files and, additionally, provide automated repairs for 18.9% of the files.","PeriodicalId":305167,"journal":{"name":"2021 IEEE/ACM 43rd International Conference on Software Engineering (ICSE)","volume":"58 6","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2021-02-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"13","resultStr":"{\"title\":\"Shipwright: A Human-in-the-Loop System for Dockerfile Repair\",\"authors\":\"Jordan Henkel, Denini Silva, Leopoldo Teixeira, Marcelo d’Amorim, T. Reps\",\"doi\":\"10.1109/ICSE43902.2021.00106\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Docker is a tool for lightweight OS-level virtualization. Docker images are created by performing a build, controlled by a source-level artifact called a Dockerfile. We studied Dockerfiles on GitHub, and-to our great surprise-found that over a quarter of the examined Dockerfiles failed to build (and thus to produce images). To address this problem, we propose SHIPWRIGHT, a human-in-the-loop system for finding repairs to broken Dockerfiles. SHIPWRIGHT uses a modified version of the BERT language model to embed build logs and to cluster broken Dockerfiles. Using these clusters and a search-based procedure, we were able to design 13 rules for making automated repairs to Dockerfiles. With the aid of SHIPWRIGHT, we submitted 45 pull requests (with a 42.2% acceptance rate) to GitHub projects with broken Dockerfiles. Furthermore, in a \\\"time-travel\\\" analysis of broken Dockerfiles that were later fixed, we found that SHIPWRIGHT proposed repairs that were equivalent to human-authored patches in 22.77% of the cases we studied. Finally, we compared our work with recent, state-of-the-art, static Dockerfile analyses, and found that, while static tools detected possible build-failure-inducing issues in 20.6–33.8% of the files we examined, SHIPWRIGHT was able to detect possible issues in 73.25% of the files and, additionally, provide automated repairs for 18.9% of the files.\",\"PeriodicalId\":305167,\"journal\":{\"name\":\"2021 IEEE/ACM 43rd International Conference on Software Engineering (ICSE)\",\"volume\":\"58 6\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2021-02-22\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"13\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"2021 IEEE/ACM 43rd International Conference on Software Engineering (ICSE)\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1109/ICSE43902.2021.00106\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"2021 IEEE/ACM 43rd International Conference on Software Engineering (ICSE)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/ICSE43902.2021.00106","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Shipwright: A Human-in-the-Loop System for Dockerfile Repair
Docker is a tool for lightweight OS-level virtualization. Docker images are created by performing a build, controlled by a source-level artifact called a Dockerfile. We studied Dockerfiles on GitHub, and-to our great surprise-found that over a quarter of the examined Dockerfiles failed to build (and thus to produce images). To address this problem, we propose SHIPWRIGHT, a human-in-the-loop system for finding repairs to broken Dockerfiles. SHIPWRIGHT uses a modified version of the BERT language model to embed build logs and to cluster broken Dockerfiles. Using these clusters and a search-based procedure, we were able to design 13 rules for making automated repairs to Dockerfiles. With the aid of SHIPWRIGHT, we submitted 45 pull requests (with a 42.2% acceptance rate) to GitHub projects with broken Dockerfiles. Furthermore, in a "time-travel" analysis of broken Dockerfiles that were later fixed, we found that SHIPWRIGHT proposed repairs that were equivalent to human-authored patches in 22.77% of the cases we studied. Finally, we compared our work with recent, state-of-the-art, static Dockerfile analyses, and found that, while static tools detected possible build-failure-inducing issues in 20.6–33.8% of the files we examined, SHIPWRIGHT was able to detect possible issues in 73.25% of the files and, additionally, provide automated repairs for 18.9% of the files.