Quantitative Availability Analysis of Hierarchical Datacenter under Power Oversubscription

Zhiguang Tang, Haihang Zhou, Yujin Zhu, Run Tian, Jianguo Yao
Published in: 2017 IEEE International Conference on Smart Computing (SMARTCOMP), 2017-05-29
DOI: 10.1109/SMARTCOMP.2017.7947039 (https://doi.org/10.1109/SMARTCOMP.2017.7947039)
Citations: 1

Abstract

For economic and efficiency reasons, modern data centers oversubscribe their power supplies to deploy as many servers as possible. Oversubscription relies on the variation in load among servers to moderate aggregate power demand. Nevertheless, power oversubscription threatens system availability: the data center may collapse as a result of overloading. Current approaches to oversubscription usually focus on managing the data center workload to avoid periods of peak power demand. However, none of the existing research considers the failure of power or utility components, which may undermine the effectiveness of these strategies. Nor can existing research answer the question of how many servers should be deployed in a data center under an availability constraint. In this paper, we propose a quantitative availability analysis of hierarchical data centers under power oversubscription. To this end, we use Markov chains and Stochastic Reward Nets (SRNs) to model the failure and repair processes of data center components. The servers at the bottom level are divided into two pools: a main pool, which holds the running servers, and a backup pool, which holds the turned-off servers. Whenever a running server fails, a server is migrated from the backup pool to the main pool. SRNs model these two pools, and Markov chains model the components in the upper level. The evaluation is based on real-life Google and Wikipedia traces. The results show the relationship between oversubscription and data center availability, which can guide data center operators in choosing an appropriate oversubscription ratio under an availability constraint.
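To make the modeling idea in the abstract concrete, the following is a minimal sketch of a steady-state availability calculation for a two-level hierarchy. It is not the authors' model: the paper uses SRNs for the server pools, whereas this sketch substitutes a plain birth-death Markov chain, and it does not model power oversubscription or overload at all. The function names, the instant-takeover behavior, the independent-repair policy, and the rule that only running servers fail are all assumptions made for illustration.

```python
def two_state_availability(failure_rate, repair_rate):
    """Steady-state availability of a single up/down Markov chain."""
    return repair_rate / (failure_rate + repair_rate)


def pool_availability(n_main, n_backup, failure_rate, repair_rate):
    """Steady-state probability that n_main servers are running.

    State k is the number of failed servers (0 .. n_main + n_backup).
    Assumptions (illustrative only): only running servers fail, failed
    servers are repaired independently, and a backup server takes over
    instantly when a running server fails.
    """
    total = n_main + n_backup
    weights = [1.0]  # unnormalised stationary weight of state 0
    for k in range(total):
        running = min(n_main, total - k)   # servers exposed to failure
        birth = running * failure_rate     # transition k -> k + 1
        death = (k + 1) * repair_rate      # transition k + 1 -> k
        weights.append(weights[-1] * birth / death)
    norm = sum(weights)
    # The pool is up while failures have not exhausted the backups.
    return sum(weights[: n_backup + 1]) / norm


def system_availability(upper_components, pool):
    """Series composition: every upper-level component and the pool must be up.

    upper_components is a list of (failure_rate, repair_rate) pairs,
    assumed independent of each other and of the pool.
    """
    availability = pool
    for failure_rate, repair_rate in upper_components:
        availability *= two_state_availability(failure_rate, repair_rate)
    return availability
```

For example, `system_availability([(0.001, 1.0)], pool_availability(10, 2, 0.01, 1.0))` combines one hypothetical upper-level power component with a pool of 10 main and 2 backup servers. The rates here are placeholders, not values from the paper.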