Recursive Evaluation of Fault Tolerance Mechanisms for SLA Management

K. Voß
Published in: Fourth International Conference on Networking and Services (ICNS 2008), 2008-03-16
DOI: 10.1109/ICNS.2008.22 (https://doi.org/10.1109/ICNS.2008.22)
Citations: 3

Abstract

Service level agreements (SLAs) have been introduced into the Grid in order to build a basis for its commercial uptake. The challenge for Grid providers in agreeing to and operating SLA-bound jobs is to ensure their fulfillment even in the case of failures. Hence, fault-tolerance mechanisms are an essential means of the provider's SLA management. The high utilization of commercially operated clusters leads to scenarios in which a job migration typically affects other scheduled jobs. These effects arise because there are not enough free resources to compensate for all resource outages. Consequently, before initiating a migration, its effects on other jobs have to be compared, and the initiation of fault-tolerance (FT-) mechanisms has to be evaluated recursively. This paper presents a measure for the benefit of initiating an FT-mechanism, the recursive evaluation procedure, and its termination condition. Evaluating the impact of an initiated chain of FT-mechanisms in this way is often more profitable than performing a single FT-mechanism, and it is therefore important for the commercialization of the Grid.
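The abstract names a benefit measure, a recursive evaluation, and a termination condition but gives no formulas, so the following is only an illustrative sketch under stated assumptions, not the paper's method. It assumes a toy model in which each job has a size, an SLA penalty, and a migration cost, each node has a capacity and hosts at most one job, and the benefit of a chain is the penalty avoided minus the total loss of the chain. The names `Job`, `Node`, `evaluate`, `benefit`, and `max_depth` are all hypothetical.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class Job:
    name: str
    size: int              # assumed resource demand
    penalty: float         # assumed SLA penalty if the job is dropped
    migration_cost: float  # assumed cost of migrating the job


@dataclass(frozen=True)
class Node:
    name: str
    capacity: int
    job: Optional[Job] = None  # None means the node is free


def evaluate(job: Job, nodes: Tuple[Node, ...], max_depth: int) -> Tuple[float, List[str]]:
    """Return (loss, plan) for the cheapest handling of `job` in this toy model."""
    # Baseline: initiate no FT-mechanism and pay the penalty.
    best_loss, best_plan = job.penalty, [f"drop {job.name}"]
    # Assumed termination conditions: migrating cannot beat the penalty,
    # or the allowed chain length is exhausted.
    if job.migration_cost >= job.penalty or max_depth == 0:
        return best_loss, best_plan
    for i, node in enumerate(nodes):
        if node.capacity < job.size:
            continue  # the job does not fit on this node
        rest = nodes[:i] + nodes[i + 1:]  # the target node is now occupied by `job`
        loss = job.migration_cost
        plan = [f"migrate {job.name} to {node.name}"]
        if node.job is not None:
            # The displaced job needs its own FT-mechanism: evaluate recursively.
            sub_loss, sub_plan = evaluate(node.job, rest, max_depth - 1)
            loss += sub_loss
            plan += sub_plan
        if loss < best_loss:
            best_loss, best_plan = loss, plan
    return best_loss, best_plan


def benefit(job: Job, loss: float) -> float:
    """Assumed benefit measure: penalty avoided minus the loss of the chosen chain."""
    return job.penalty - loss


# Job A lost its node; the cluster has no free node large enough for it.
job_a = Job("A", size=4, penalty=100.0, migration_cost=5.0)
job_b = Job("B", size=2, penalty=20.0, migration_cost=3.0)
job_c = Job("C", size=1, penalty=4.0, migration_cost=1.0)
nodes = (Node("n1", 4, job_b), Node("n2", 2, job_c), Node("n3", 1))

single = evaluate(job_a, nodes, max_depth=1)  # one migration, victim is dropped: loss 25.0
chain = evaluate(job_a, nodes, max_depth=3)   # chain of three migrations: loss 9.0
print(single, benefit(job_a, single[0]))
print(chain, benefit(job_a, chain[0]))
```

In this made-up scenario the chain A to n1, B to n2, C to n3 avoids every penalty, while the single migration forces job B to be dropped, which mirrors the abstract's claim that evaluating a chain of FT-mechanisms can be more profitable than a single one. Removing the occupied target node from `rest` at each step keeps the recursion finite even without the depth limit.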