A Robust Communication Framework for Parallel Execution on Volunteer PC Grids

Eshwar Rohit, Hien Nguyen, N. Kanna, J. Subhlok, E. Gabriel, Qian Wang, M. Cheung, David P. Anderson
Published in: 2011 11th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing
Publication date: 2011-05-23
DOI: 10.1109/CCGrid.2011.72 (https://doi.org/10.1109/CCGrid.2011.72)
Citations: 8

Abstract

Volunteer PC grids represent massive computation capacity at a low cost, but are challenging to employ for parallel computing because of variable and unpredictable performance and availability. A communicating parallel program must employ explicit redundancy, or implicit redundancy with uncoordinated checkpoint-restart, to make continuous forward progress in such an unreliable environment. A communication model based on one-sided Put/Get calls to an abstract global shared space is a good match because processes can execute their communication operations independently and asynchronously. However, no existing system is designed for redundant communicating processes. The key problem is that a single logical operation that impacts the global program state may be executed by different instances of the same process at different times, leading to semantic inconsistency. This paper presents the design, execution model, implementation, and usage of Volpex, a communication layer for robust execution on volunteer PC grids. The research leads to a practical way to employ idle PCs for latency-tolerant parallel computing applications.
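The inconsistency described in the abstract, where replicas of one logical process issue the same Put or Get at different times, can be illustrated with a toy shared space. The sketch below is an assumption made for illustration and not a description of Volpex's actual design or API: the class `SharedSpace`, the parameters `process_id` and `op_seq`, and the policy of ignoring replayed Puts and logging Get results are all hypothetical. It shows one plausible way to make each logical operation take effect exactly once, regardless of how many replicas execute it.

```python
class SharedSpace:
    """Toy global shared space that tolerates replicated callers.

    Hypothetical illustration only; not the Volpex API.
    Each logical operation is identified by (process_id, op_seq), where
    op_seq is the position of the call in that process's program order.
    All replicas of a logical process use the same identifiers.
    """

    def __init__(self):
        self._data = {}          # tag -> current value
        self._puts_seen = set()  # logical Puts that were already applied
        self._get_log = {}       # logical Get -> value returned the first time

    def put(self, process_id, op_seq, tag, value):
        """Apply a Put once; a replayed Put from a slower replica is ignored."""
        key = (process_id, op_seq)
        if key in self._puts_seen:
            return False
        self._puts_seen.add(key)
        self._data[tag] = value
        return True

    def get(self, process_id, op_seq, tag):
        """Return the value seen by the first replica to issue this Get.

        Raises KeyError if the tag was never written; a real system
        would more likely block or retry instead.
        """
        key = (process_id, op_seq)
        if key not in self._get_log:
            self._get_log[key] = self._data[tag]
        return self._get_log[key]


if __name__ == "__main__":
    space = SharedSpace()
    space.put(0, 1, "x", 10)        # process 0, fast replica, operation 1
    fast = space.get(1, 1, "x")     # process 1, fast replica reads x
    space.put(0, 2, "x", 20)        # process 0, fast replica moves on
    space.put(0, 1, "x", 10)        # process 0, slow replica replays op 1
    slow = space.get(1, 1, "x")     # process 1, slow replica replays its read
    print(fast, slow, space.get(1, 2, "x"))  # prints "10 10 20"
```

In this sketch the slow replica of process 1 observes the same value as the fast one, even though `x` has since been overwritten, and the replayed Put does not roll `x` back to an older value. The cost is that the log of Get results grows with the number of operations; a practical system would need some way to prune it.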