Magpie: Automatically Tuning Static Parameters for Distributed File Systems using Deep Reinforcement Learning
Houkun Zhu, Dominik Scheinert, L. Thamsen, Kordian Gontarska, O. Kao
2022 IEEE International Conference on Cloud Engineering (IC2E), published 2022-07-19. DOI: 10.1109/IC2E55432.2022.00023
Distributed file systems are widely used nowadays, yet using their default configurations is often not optimal. At the same time, tuning configuration parameters is typically challenging and time-consuming: it demands expertise, and tuning operations can be expensive. This is especially the case for static parameters, where changes take effect only after a restart of the system or workloads. We propose a novel approach, Magpie, which utilizes deep reinforcement learning to tune static parameters by strategically exploring and exploiting configuration parameter spaces. To boost the tuning of static parameters, our method employs both server and client metrics of distributed file systems to understand the relationship between static parameters and performance. Our empirical evaluation results show that Magpie can noticeably improve the performance of the distributed file system Lustre, where our approach on average achieves 91.8% throughput gains against the default configuration after tuning towards single performance indicator optimization, while it reaches 39.7% more throughput gains against the baseline.
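The abstract describes a tuning loop: observe server- and client-side metrics, propose a static-parameter configuration, restart the system so the parameters take effect, benchmark, and use the measured throughput as the reward signal. The sketch below illustrates that loop only in broad strokes; it is not the paper's implementation. A simple epsilon-greedy agent stands in for Magpie's deep reinforcement learning agent, and all parameter names, value ranges, and helper functions (apply_config_and_restart, run_benchmark_and_collect_metrics) are hypothetical placeholders.

```python
import random

# Hypothetical static parameters and candidate values; the real Lustre
# parameter set and ranges are not specified in the abstract.
PARAM_SPACE = {
    "stripe_count": [1, 2, 4, 8],
    "stripe_size_mb": [1, 4, 16, 64],
    "max_rpcs_in_flight": [8, 16, 32, 64],
}

def apply_config_and_restart(config):
    """Placeholder: apply static parameters and restart the file system.

    In a real deployment this would reconfigure the servers/clients and
    restart them, since static parameters only take effect afterwards.
    """
    print(f"Applying config and restarting: {config}")

def run_benchmark_and_collect_metrics(config):
    """Placeholder: run a workload, gather server- and client-side metrics,
    and return the measured throughput used as the reward signal.
    """
    # Synthetic stand-in value so the sketch runs end to end.
    return sum(hash((k, v)) % 100 for k, v in config.items()) / 100.0

def random_config():
    """Explore: sample a random configuration from the parameter space."""
    return {k: random.choice(v) for k, v in PARAM_SPACE.items()}

def mutate(config):
    """Exploit: perturb one parameter of the best-known configuration."""
    new = dict(config)
    key = random.choice(list(PARAM_SPACE))
    new[key] = random.choice(PARAM_SPACE[key])
    return new

def tune(iterations=20, epsilon=0.3, seed=0):
    """Epsilon-greedy tuning loop: explore random configurations with
    probability epsilon, otherwise exploit around the best one seen so far.
    """
    random.seed(seed)
    best_config, best_reward = random_config(), float("-inf")
    for step in range(iterations):
        if best_reward == float("-inf") or random.random() < epsilon:
            candidate = random_config()      # explore
        else:
            candidate = mutate(best_config)  # exploit
        apply_config_and_restart(candidate)
        reward = run_benchmark_and_collect_metrics(candidate)
        if reward > best_reward:
            best_config, best_reward = candidate, reward
        print(f"step {step}: reward={reward:.2f}, best={best_reward:.2f}")
    return best_config

if __name__ == "__main__":
    print("Best configuration found:", tune())
```

In the paper's setting, the candidate selection step would be handled by a deep reinforcement learning policy conditioned on the collected server and client metrics, and each evaluation is costly because it involves a restart, which is precisely why sample-efficient exploration of the configuration space matters.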