Runtime Recovery Actions Selection for Sporadic Operations on Cloud

Min Fu, Liming Zhu, Daniel W. Sun, Anna Liu, L. Bass, Q. Lu
2015 24th Australasian Software Engineering Conference, published 2015-09-28
DOI: 10.1109/ASWEC.2015.33 (https://doi.org/10.1109/ASWEC.2015.33)
Citations: 3

Abstract

Sporadic operations such as rolling upgrade or machine instance redeployment are prone to unpredictable failures in the cloud, largely because of the cloud's inherently high variability. Previous dependability research has established several recovery methods for cloud failures. In this paper, we first propose eight recovery patterns for sporadic operations. We then present a filtering process that selects the recovery patterns applicable to a given operational step. We also propose a methodology for evaluating the recovery actions generated for the applicable recovery patterns, based on three metrics: Recovery Time, Recovery Cost and Recovery Impact. This quantitative evaluation leads to the selection of optimal recovery actions. We implement a recovery service and illustrate its applicability by recovering from errors that occur in an Asgard rolling upgrade operation on the cloud. The experimental results show that the recovery service enhances automated recovery from operational failures by selecting the optimal recovery actions.
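The filter-then-evaluate flow described in the abstract can be pictured with a small sketch. The abstract does not give the paper's scoring formula or the names of the eight patterns, so everything below is an assumption made for illustration: the action names, the applicability predicate, the per-metric normalisation and the weighted sum are all hypothetical, not the authors' method.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional


@dataclass(frozen=True)
class RecoveryAction:
    """A candidate recovery action with its three estimated metrics."""
    name: str               # hypothetical label, not a pattern name from the paper
    recovery_time: float    # estimated time to recover (lower is better)
    recovery_cost: float    # estimated cost of recovering (lower is better)
    recovery_impact: float  # estimated impact on the system (lower is better)


METRICS = ("recovery_time", "recovery_cost", "recovery_impact")


def filter_applicable(
    actions: List[RecoveryAction],
    step: str,
    is_applicable: Callable[[RecoveryAction, str], bool],
) -> List[RecoveryAction]:
    """Keep only the actions the predicate accepts for this operational step."""
    return [a for a in actions if is_applicable(a, step)]


def select_action(
    actions: List[RecoveryAction], weights: Dict[str, float]
) -> Optional[RecoveryAction]:
    """Return the action with the lowest weighted score, or None if there are none.

    Assumption: each metric is divided by its maximum over the candidates so
    that the three metrics are comparable, then combined as a weighted sum.
    """
    if not actions:
        return None
    maxima = {m: max(getattr(a, m) for a in actions) for m in METRICS}

    def score(action: RecoveryAction) -> float:
        return sum(
            weights[m] * (getattr(action, m) / maxima[m] if maxima[m] > 0 else 0.0)
            for m in METRICS
        )

    return min(actions, key=score)


# Hypothetical candidates and a hypothetical applicability rule.
candidates = [
    RecoveryAction("action_a", 10.0, 2.0, 1.0),
    RecoveryAction("action_b", 20.0, 1.0, 2.0),
    RecoveryAction("action_c", 30.0, 3.0, 3.0),
]
applicable = filter_applicable(
    candidates, "hypothetical_step", lambda a, step: a.name != "action_c"
)
best = select_action(applicable, {m: 1.0 for m in METRICS})
```

Equal weights treat the three metrics as equally important; an operator who cares more about downtime than about cost would raise the weight on `recovery_time`.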