Scalable and Resilient Workflow Executions on Production Distributed Computing Infrastructures

J. R. Balderrama, Tram Truong Huu, J. Montagnat
{"title":"Scalable and Resilient Workflow Executions on Production Distributed Computing Infrastructures","authors":"J. R. Balderrama, Tram Truong Huu, J. Montagnat","doi":"10.1109/ISPDC.2012.24","DOIUrl":null,"url":null,"abstract":"In spite of the growing interest for grids and cloud infrastructures among scientific communities and the availability of such facilities at large-scale, achieving high performance in production environments remains challenging due to at least four factors: the low reliability of very large-scale distributed computing infrastructures, the performance overhead induced by shared facilities, the difficulty to obtain fair balance of all user jobs in such an heterogeneous environment, and the complexity of large-scale distributed applications deployment. All together, these difficulties make infrastructure exploitation complex, and often limited to experts. This paper introduces a pragmatic solution to tackle these four issues based on a service-oriented methodology, the reuse of existing middleware services, and the joint exploitation of local and distributed computing resources. Emphasis is put on the integrated environment ease of use. Results on an actual neuroscience application show the impact of the environment setup in terms of reliability and performance. Recommendations and best practices are derived from this experiment.","PeriodicalId":287900,"journal":{"name":"2012 11th International Symposium on Parallel and Distributed Computing","volume":"1 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2012-06-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"12","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"2012 11th International Symposium on Parallel and Distributed Computing","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/ISPDC.2012.24","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 12

Abstract

In spite of the growing interest for grids and cloud infrastructures among scientific communities and the availability of such facilities at large-scale, achieving high performance in production environments remains challenging due to at least four factors: the low reliability of very large-scale distributed computing infrastructures, the performance overhead induced by shared facilities, the difficulty to obtain fair balance of all user jobs in such an heterogeneous environment, and the complexity of large-scale distributed applications deployment. All together, these difficulties make infrastructure exploitation complex, and often limited to experts. This paper introduces a pragmatic solution to tackle these four issues based on a service-oriented methodology, the reuse of existing middleware services, and the joint exploitation of local and distributed computing resources. Emphasis is put on the integrated environment ease of use. Results on an actual neuroscience application show the impact of the environment setup in terms of reliability and performance. Recommendations and best practices are derived from this experiment.
生产分布式计算基础设施上的可伸缩和弹性工作流执行
尽管科学界对网格和云基础设施的兴趣越来越大,而且此类设施的大规模可用性也越来越高,但由于至少四个因素,在生产环境中实现高性能仍然具有挑战性:非常大规模的分布式计算基础设施的低可靠性、共享设施带来的性能开销、在这种异构环境中难以获得所有用户作业的公平平衡以及大规模分布式应用程序部署的复杂性。总之,这些困难使基础设施开发变得复杂,而且往往仅限于专家。本文介绍了一种实用的解决方案来解决这四个问题,该解决方案基于面向服务的方法、现有中间件服务的重用以及本地和分布式计算资源的联合利用。重点是集成环境的易用性。在一个实际的神经科学应用中,结果显示了环境设置在可靠性和性能方面的影响。建议和最佳实践来源于该实验。
本文章由计算机程序翻译,如有差异,请以英文原文为准。
求助全文
约1分钟内获得全文 求助全文
来源期刊
自引率
0.00%
发文量
0
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
确定
请完成安全验证×
copy
已复制链接
快去分享给好友吧!
我知道了
右上角分享
点击右上角分享
0
联系我们:info@booksci.cn Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。 Copyright © 2023 布克学术 All rights reserved.
京ICP备2023020795号-1
ghs 京公网安备 11010802042870号
Book学术文献互助
Book学术文献互助群
群 号:604180095
Book学术官方微信