Phoenix: making data-intensive grid applications fault-tolerant

George Kola, T. Kosar, M. Livny
{"title":"Phoenix: making data-intensive grid applications fault-tolerant","authors":"George Kola, T. Kosar, M. Livny","doi":"10.1109/GRID.2004.51","DOIUrl":null,"url":null,"abstract":"A major hurdle facing data intensive grid applications is the appropriate handling of failures that occur in the grid-environment. Implementing the fault-tolerance transparently at the grid-middleware level would make different data intensive applications fault-tolerant without each having to pay a separate cost and reduce the time to grid-based solution for many scientific problems. We analyzed the failures encountered by four real-life production data intensive applications: NCSA image processing pipeline, WCER video processing pipeline, US-CMS pipeline and BMRB BLAST pipeline. Taking the result of the analysis into account, we have designed and implemented Phoenix, a transparent middleware-level fault-tolerance layer that detects failures early, classifies failures into transient and permanent and appropriately handIes the transient failures. We applied our fault-tolerance layer to a prototype of the NCSA image processing pipeline and considerably improved the failure handling and report on the insights gained in the process.","PeriodicalId":335281,"journal":{"name":"Fifth IEEE/ACM International Workshop on Grid Computing","volume":"21 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2004-11-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"29","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Fifth IEEE/ACM International Workshop on Grid Computing","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/GRID.2004.51","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 29

Abstract

A major hurdle facing data intensive grid applications is the appropriate handling of failures that occur in the grid-environment. Implementing the fault-tolerance transparently at the grid-middleware level would make different data intensive applications fault-tolerant without each having to pay a separate cost and reduce the time to grid-based solution for many scientific problems. We analyzed the failures encountered by four real-life production data intensive applications: NCSA image processing pipeline, WCER video processing pipeline, US-CMS pipeline and BMRB BLAST pipeline. Taking the result of the analysis into account, we have designed and implemented Phoenix, a transparent middleware-level fault-tolerance layer that detects failures early, classifies failures into transient and permanent and appropriately handIes the transient failures. We applied our fault-tolerance layer to a prototype of the NCSA image processing pipeline and considerably improved the failure handling and report on the insights gained in the process.
Phoenix:使数据密集型网格应用程序具有容错性
数据密集型网格应用程序面临的一个主要障碍是正确处理网格环境中发生的故障。在网格中间件级别透明地实现容错将使不同的数据密集型应用程序具有容错能力,而无需每个应用程序支付单独的成本,并且减少了许多科学问题的基于网格的解决方案的时间。我们分析了NCSA图像处理管道、WCER视频处理管道、US-CMS管道和BMRB BLAST管道四个实际生产数据密集型应用中遇到的故障。考虑到分析的结果,我们设计并实现了Phoenix,这是一个透明的中间件级容错层,可以早期检测故障,将故障分为瞬态故障和永久故障,并适当地处理瞬态故障。我们将我们的容错层应用于NCSA图像处理管道的原型,并大大改进了故障处理和报告过程中获得的见解。
本文章由计算机程序翻译,如有差异,请以英文原文为准。
求助全文
约1分钟内获得全文 求助全文
来源期刊
自引率
0.00%
发文量
0
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
确定
请完成安全验证×
copy
已复制链接
快去分享给好友吧!
我知道了
右上角分享
点击右上角分享
0
联系我们:info@booksci.cn Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。 Copyright © 2023 布克学术 All rights reserved.
京ICP备2023020795号-1
ghs 京公网安备 11010802042870号
Book学术文献互助
Book学术文献互助群
群 号:481959085
Book学术官方微信