Supporting dynamic space-sharing on clusters of non-dedicated workstations

Abdur Chowdhury, Lisa D. Nicklas, Sanjeev Setia, E. White
Published in: Proceedings of 17th International Conference on Distributed Computing Systems, 1997-05-27
DOI: 10.1109/ICDCS.1997.597902 (https://doi.org/10.1109/ICDCS.1997.597902)
Citations: 18

Abstract

Clusters of workstations are increasingly viewed as a cost-effective alternative to parallel supercomputers. However, resource management and scheduling on workstation clusters are complicated by the fact that the number of idle workstations available for executing parallel applications fluctuates constantly. We present a case for scheduling parallel applications on non-dedicated workstation clusters using dynamic space-sharing, a policy under which the number of processors allocated to an application can be changed during its execution. We describe an approach that uses application-level checkpointing and data repartitioning to support dynamic space-sharing and to handle the dynamic reconfiguration triggered when failure or owner activity is detected on a workstation being used by a parallel application. The performance advantages of dynamic space-sharing are quantified through a simulation study, and experimental results are presented for the overhead of dynamic reconfiguration of a grid-oriented data-parallel application using our approach.
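
The abstract does not give implementation details, so the following is only an illustrative sketch of what checkpointing and repartitioning a grid might look like when the processor count changes. It assumes a one-dimensional block decomposition of grid rows and simulates everything in a single process. All function names (`block_partition`, `checkpoint`, `repartition`, `rows_moved`) are hypothetical and are not taken from the paper.

```python
from typing import Dict, List, Tuple

Grid = List[List[float]]
Ranges = List[Tuple[int, int]]


def block_partition(n_rows: int, n_procs: int) -> Ranges:
    """Split rows [0, n_rows) into n_procs contiguous half-open ranges."""
    if n_procs <= 0:
        raise ValueError("need at least one processor")
    base, extra = divmod(n_rows, n_procs)
    ranges: Ranges = []
    start = 0
    for p in range(n_procs):
        # The first `extra` processors take one additional row each.
        size = base + (1 if p < extra else 0)
        ranges.append((start, start + size))
        start += size
    return ranges


def checkpoint(local_blocks: Dict[int, Grid]) -> Grid:
    """Gather every processor's rows, in rank order, into one grid."""
    grid: Grid = []
    for p in sorted(local_blocks):
        grid.extend(local_blocks[p])
    return grid


def repartition(grid: Grid, n_procs: int) -> Tuple[Dict[int, Grid], Ranges]:
    """Redistribute a checkpointed grid over a new processor count."""
    ranges = block_partition(len(grid), n_procs)
    blocks = {p: grid[lo:hi] for p, (lo, hi) in enumerate(ranges)}
    return blocks, ranges


def rows_moved(old: Ranges, new: Ranges, n_rows: int) -> int:
    """Count rows whose owning processor differs between two layouts."""
    def owners(ranges: Ranges) -> List[int]:
        out = [0] * n_rows
        for p, (lo, hi) in enumerate(ranges):
            for r in range(lo, hi):
                out[r] = p
        return out

    return sum(a != b for a, b in zip(owners(old), owners(new)))


if __name__ == "__main__":
    grid = [[float(r)] * 3 for r in range(10)]
    blocks, old_ranges = repartition(grid, 4)        # start on 4 workstations
    saved = checkpoint(blocks)                       # owner returns to one of them
    blocks, new_ranges = repartition(saved, 3)       # continue on 3
    print(old_ranges, new_ranges, rows_moved(old_ranges, new_ranges, len(grid)))
```

In this toy layout the number of rows that change owner gives a rough sense of the data movement a reconfiguration would cause. It is not a measurement of the overhead reported in the paper.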