Efficient Automated Decomposition of Build Targets at Large-Scale
Lukáš Jendele, Markus Schwenk, Diana Cremarenco, I. Janicijevic, M. Rybalkin
2019 12th IEEE Conference on Software Testing, Validation and Verification (ICST), April 2019. DOI: 10.1109/ICST.2019.00055
Large monolithic codebases, such as those used at Google and Facebook, enable engineers to share code easily and collaborate across teams. Such codebases are partitioned into a huge number of libraries, binaries, and tests, but engineers currently have to declare the build dependencies between those blocks of functionality manually. One inefficiency this introduces is underutilized libraries, i.e. libraries that provide more functionality than the dependent code requires. This results in slow builds and an increased load on the Continuous Integration System. In this paper, we propose a way to automatically find underutilized libraries and decompose them into a set of smaller components, where each component is a standalone library. Our work focuses on decompositions at the source-file level. While prior work proposed decompositions when the final number of components was given as an input, we introduce an algorithm, AutoDecomposer, that finds the number of components automatically. In contrast to existing work, we analyze how a decomposition would lower the number of tests triggered by the Continuous Integration System, in order to select only those decompositions that have an impact. We evaluate AutoDecomposer's efficiency by comparing its potential impact to the maximum theoretical impact achievable with the most granular decomposition. We conclude that applying AutoDecomposer's decompositions achieves 95% of the theoretical maximum reduction in test-triggering frequency, while generating only 4% as many components for large targets, and 30% as many components on average, compared to the theoretically most efficient approach.
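To make the idea above concrete, here is a minimal sketch, not the authors' implementation, of what a file-level decomposition and its test-trigger estimate could look like. It assumes only the general setting described in the abstract; the names `file_deps` and `test_usage` and the toy data are invented for illustration, and the most granular decomposition is approximated by the strongly connected components of the intra-library file dependency graph (files in a dependency cycle cannot be split into standalone libraries).

```python
# Hypothetical sketch of file-level library decomposition; not the paper's code.
# Assumptions: `file_deps` maps each source file of one library to the files it
# depends on inside the same library; `test_usage` maps each dependent test to
# the library files it directly uses. Both names and the toy data are invented.
import networkx as nx

file_deps = {
    "a.cc": ["b.cc"],   # a.cc uses b.cc
    "b.cc": [],
    "c.cc": ["d.cc"],
    "d.cc": ["c.cc"],   # c.cc and d.cc form a cycle: they must stay together
    "e.cc": [],
}
test_usage = {
    "test_ab": ["a.cc"],
    "test_cd": ["c.cc"],
    "test_e":  ["e.cc"],
}

# 1. Most granular valid decomposition: strongly connected components of the
#    intra-library dependency graph become candidate standalone libraries.
G = nx.DiGraph()
G.add_nodes_from(file_deps)
G.add_edges_from((f, d) for f, deps in file_deps.items() for d in deps)
condensed = nx.condensation(G)  # DAG whose nodes are the components

# 2. For each test, the set of components it transitively needs.
membership = {f: c for c, data in condensed.nodes(data=True)
              for f in data["members"]}

def needed_components(files):
    comps = {membership[f] for f in files}
    for c in list(comps):
        comps |= nx.descendants(condensed, c)  # transitive deps stay included
    return comps

needs = {t: needed_components(fs) for t, fs in test_usage.items()}

# 3. Estimate the benefit: before decomposing, any change to the library
#    triggers every dependent test; afterwards, a change to file f triggers
#    test t only if t needs f's component.
total = len(file_deps) * len(test_usage)
triggered = sum(membership[f] in need
                for f in file_deps for need in needs.values())
print(f"test triggers per full sweep of file changes: {total} -> {triggered}")
```

On this toy data the estimate drops from 15 triggered (file, test) pairs to 5. The step the abstract attributes to AutoDecomposer, automatically choosing a smaller number of components by keeping only decompositions that preserve most of this reduction, is deliberately omitted here; the sketch only reproduces the most granular baseline against which the paper's 95%, 4%, and 30% figures are measured.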