DRLPart: A Deep Reinforcement Learning Framework for Optimally Efficient and Robust Resource Partitioning on Commodity Servers

Ruobing Chen, Jinping Wu, Haosen Shi, Yusen Li, Xiaoguang Liu, Gang Wang
Published in: Proceedings of the 30th International Symposium on High-Performance Parallel and Distributed Computing
DOI: 10.1145/3431379.3460648 (https://doi.org/10.1145/3431379.3460648)
Publication date (as listed): 2020-06-21
Citations: 10

Abstract

Workload consolidation is a commonly used approach for improving the resource utilization of commodity servers. However, colocated workloads often suffer significant performance degradation due to resource contention, which makes resource partitioning an important research problem. Partitioning multiple resources in a coordinated manner is particularly challenging because of the complex contention behaviors and the huge solution space, and this problem is not well addressed in the literature. In this paper, we propose a deep reinforcement learning (DRL) framework, named DRLPart, for partitioning multiple resources in a coordinated manner. DRLPart learns the optimal partitioning decision from easy-to-collect real-time system state, without requiring domain knowledge or handcrafted search heuristics. We solve two critical challenges of applying DRL to the resource partitioning problem. First, we build a deep-learning-based performance model that estimates the rewards of actions without interacting with the real system, which significantly reduces the training overhead. Second, we propose a fine-tuning process that improves the bad decisions occasionally made by the DRL model, which enhances adaptivity to new situations. Results from extensive evaluations show that the proposed framework is optimally efficient and robust, improving system throughput by 13.3% to 18.5% compared with state-of-the-art baselines.
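The two ideas named in the abstract, scoring actions with a performance model instead of the real system and then fine-tuning a decision, can be illustrated with a toy sketch. Everything here is an assumption made for illustration and is not taken from the paper: the names `surrogate_throughput` and `fine_tune` are hypothetical, the logarithmic curve is a placeholder for a trained deep model, the even split stands in for the DRL policy's output, and the greedy local search is only one plausible way a fine-tuning step might work.

```python
import math
from typing import Callable, List, Sequence

Partition = List[int]  # units of one resource assigned to each workload


def surrogate_throughput(partition: Sequence[int],
                         sensitivity: Sequence[float]) -> float:
    """Hypothetical stand-in for a learned performance model.

    Uses a concave (diminishing-returns) curve per workload. A real model
    would be a trained network fed with system-state features.
    """
    return sum(s * math.log1p(units)
               for s, units in zip(sensitivity, partition))


def fine_tune(partition: Sequence[int],
              score: Callable[[Sequence[int]], float],
              min_units: int = 1) -> Partition:
    """Greedy local search (an assumption, not the paper's procedure).

    Repeatedly moves one unit from one workload to another while the
    estimated score strictly improves.
    """
    best = list(partition)
    best_score = score(best)
    improved = True
    while improved:
        improved = False
        for src in range(len(best)):
            for dst in range(len(best)):
                if src == dst or best[src] <= min_units:
                    continue
                candidate = list(best)
                candidate[src] -= 1
                candidate[dst] += 1
                candidate_score = score(candidate)
                if candidate_score > best_score + 1e-12:
                    best, best_score = candidate, candidate_score
                    improved = True
    return best


# Illustrative values only: three workloads sharing 12 units of a resource.
sensitivity = [3.0, 1.0, 0.5]


def score(partition: Sequence[int]) -> float:
    return surrogate_throughput(partition, sensitivity)


initial = [4, 4, 4]  # placeholder for the decision a DRL policy would emit
tuned = fine_tune(initial, score)
print(initial, "->", tuned)
```

Because every candidate is scored by the surrogate, the search never touches a running system, which is the property the abstract credits for the reduced training overhead. The search keeps the total number of units fixed and never drops a workload below `min_units`.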