A Stochastic Approach to Determine the Optimal Number of Servers for Reliable and Energy Efficient Operation of Data Centers

IF 3 · Zone 3 (Computer Science) · Q2 COMPUTER SCIENCE, HARDWARE & ARCHITECTURE
Kazi Main Uddin Ahmed;Math H. J. Bollen;Manuel Alvarez
DOI: 10.1109/TSUSC.2022.3216350
Journal: IEEE Transactions on Sustainable Computing
Publication date: 2022-10-21
Publication type: Journal Article
Open access: No
URL: https://ieeexplore.ieee.org/document/9926075/
Citations: 0

Abstract

The growing demand for data center computational capacity in recent years has introduced new operational challenges, among them maintaining service level agreements (SLAs) and quality of service (QoS) while limiting energy consumption. This paper presents a stochastic operational risk assessment approach that estimates the number of spare servers a data center requires, taking into account the risk of servers failing in operation, since the servers define the data center's computational capability. A reliability index called "risk of computational resource commitment" (RCRC) is introduced. It quantifies the probability of having insufficient spare servers because of failures during the operational lead time; its complement indicates the ability of the resources to maintain the data center's SLA. The server failure rates are obtained using a Monte Carlo simulation with failure data published by Google in 2019. The analysis shows that the RCRC decreases as the number of spare servers increases, although adding spare servers also puts pressure on the data center's energy efficiency. The RCRC index could be used in data center operation to avoid overprovisioning servers and to limit the number of spare servers, striking a suitable balance between QoS and energy consumption.
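The abstract does not give the RCRC formula, so the following is only a minimal sketch of the general idea: estimating by Monte Carlo simulation the probability that failures during a lead time exceed the available spares. It assumes independent servers with exponentially distributed lifetimes, and every name and number in it (`estimate_rcrc`, the failure rate, the lead time, the server counts) is a hypothetical placeholder rather than a value or method taken from the paper or from the Google failure data.

```python
import math
import random


def failure_probability(failure_rate_per_hour: float, lead_time_hours: float) -> float:
    """Chance that one server fails within the lead time.

    Assumption (not from the paper): exponentially distributed lifetimes.
    """
    return 1.0 - math.exp(-failure_rate_per_hour * lead_time_hours)


def estimate_rcrc(
    active_servers: int,
    spare_servers: int,
    failure_rate_per_hour: float,
    lead_time_hours: float,
    trials: int = 5000,
    seed: int = 0,
) -> float:
    """Monte Carlo estimate of P(failures in lead time > spare servers).

    Assumption (not from the paper): servers fail independently.
    """
    rng = random.Random(seed)
    p_fail = failure_probability(failure_rate_per_hour, lead_time_hours)
    shortfalls = 0
    for _ in range(trials):
        # Draw one failure/no-failure outcome per active server.
        failures = sum(1 for _ in range(active_servers) if rng.random() < p_fail)
        if failures > spare_servers:
            shortfalls += 1
    return shortfalls / trials


if __name__ == "__main__":
    # Hypothetical inputs chosen only to show the trend.
    for spares in (0, 1, 2, 4):
        risk = estimate_rcrc(200, spares, failure_rate_per_hour=1e-4, lead_time_hours=24)
        print(f"spares={spares}  estimated risk={risk:.4f}  complement={1 - risk:.4f}")
```

Under these assumptions the estimated risk falls as spares are added, which mirrors the trend the abstract reports. The energy cost of keeping those spares powered is not modeled here.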
Source journal: IEEE Transactions on Sustainable Computing
Subject category: Mathematics-Control and Optimization
CiteScore: 7.70
Self-citation rate: 2.60%
Articles published: 54