PonD: dynamic creation of HTC pool on demand using a decentralized resource discovery system

IEEE International Symposium on High-Performance Parallel Distributed Computing | Pub Date: 2012-06-18 | DOI: 10.1145/2287076.2287105

Kyungyong Lee, D. Wolinsky, R. Figueiredo


Citation count: 4

Abstract

High Throughput Computing (HTC) platforms aggregate heterogeneous resources to provide vast amounts of computing power over a long period of time. Typical HTC systems, such as Condor and BOINC, rely on central managers for resource discovery and scheduling. While this approach simplifies deployment, it requires careful system configuration and management to ensure high availability and scalability. In this paper, we present a novel approach that integrates a self-organizing P2P overlay, used for scalable and timely discovery of resources, with unmodified client/server job scheduling middleware in order to create HTC virtual resource Pools on Demand (PonD). This approach decouples resource discovery and scheduling from job execution/monitoring: a job submission dynamically generates an HTC platform from resources discovered through match-making in a large "sea" of resources in the P2P overlay, forming a "PonD" capable of leveraging unmodified HTC middleware for job execution and monitoring. Using first-order analytical models and large-scale simulation results, we show that the job scheduling time of our approach scales as O(log N), where N is the number of resources in a pool. To verify the practicality of PonD, we have implemented a prototype using Condor (called C-PonD), a structured P2P overlay, and a PonD creation module. Experimental results with the prototype in two WAN environments (PlanetLab and the FutureGrid cloud computing testbed) demonstrate the utility of C-PonD as an HTC approach that does not rely on a central repository for maintaining all resource information. Though the prototype is based on Condor, the decoupled nature of the system components (decentralized resource discovery, PonD creation, and job execution/monitoring) is generally applicable to other grid computing middleware systems.
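To make the O(log N) claim concrete, the following is a minimal illustrative sketch, not the authors' implementation. It assumes a query is forwarded down a balanced tree laid over the resources and that each node checks the job's requirements locally; the abstract does not specify the overlay's routing, so the tree layout, the `Resource` fields, and the `discover` function are all hypothetical names introduced here. Because every level of the tree can be visited in parallel, the number of forwarding hops (the tree depth) stands in for discovery latency and grows logarithmically with the number of resources.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass
class Resource:
    """Hypothetical resource description; the fields are illustrative only."""
    node_id: int
    memory_mb: int
    cpus: int


def discover(
    resources: List[Resource],
    requirement: Callable[[Resource], bool],
    fanout: int = 2,
) -> Tuple[List[int], int]:
    """Forward a query from node 0 down an implicit balanced tree.

    Node i forwards to children fanout*i + 1 .. fanout*i + fanout
    (an assumed layout, not the paper's overlay). Returns the ids of
    matching nodes and the maximum number of forwarding hops needed.
    """
    if not resources:
        return [], 0
    matches: List[int] = []
    max_hops = 0
    stack = [(0, 0)]  # (node index, hops from the query origin)
    while stack:
        idx, hops = stack.pop()
        max_hops = max(max_hops, hops)
        # Local match-making: each node evaluates the job requirement itself.
        if requirement(resources[idx]):
            matches.append(resources[idx].node_id)
        first_child = fanout * idx + 1
        for child in range(first_child, first_child + fanout):
            if child < len(resources):
                stack.append((child, hops + 1))
    return sorted(matches), max_hops


def make_pool(n: int) -> List[Resource]:
    """Build a synthetic 'sea' of n resources with varied memory and CPUs."""
    return [Resource(i, 1024 * (1 + i % 4), 1 + i % 2) for i in range(n)]


if __name__ == "__main__":
    needs = lambda r: r.memory_mb >= 4096 and r.cpus >= 2
    for n in (1_000, 1_000_000):
        found, hops = discover(make_pool(n), needs)
        print(f"N={n}: {len(found)} matches, {hops} hops")
```

In this sketch, growing the pool a thousandfold (from 1,000 to 1,000,000 nodes) roughly doubles the hop count rather than multiplying it by a thousand, which is the behavior an O(log N) scheduling time implies. The matching nodes returned by such a query would then be the candidates from which a pool is formed.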

