SCOUT: A Monitor and Profiler of Grid Resources for Large-Scale Scientific Computing

2015 International Conference on Cloud and Autonomic Computing; Pub Date: 2015-09-21; DOI: 10.1109/ICCAC.2015.39

Md. Azam Hossain, Hieu Trong Vu, Jik-Soo Kim, Myungho Lee, Soonwook Hwang

Citations: 3

Abstract

Computational Grids consist of heterogeneous collections of geographically distributed computing resources and have supported numerous scientific applications that require substantial computing power and storage space. For scientists who want to leverage these Grid computing resources, locating appropriate resources with minimal allocation overhead is crucial to executing large-scale scientific applications successfully. However, Grid resource availability is highly unstable, and the current Grid Information Service (GIS) does not provide accurate state information about computing resources. This makes it very difficult for users and systems (schedulers, resource brokers) to schedule jobs in the Grid and to map tasks onto appropriate available resources. In this paper, we present the SCOUT system, which provides scientific users with current state information about Grid computing resources, including the number of available CPU cores and the average response time to get resources allocated. With SCOUT, we can periodically profile the resource availability of the Computing Elements (CEs) in Grids and monitor their average response time and performance. It provides a mechanism to find the number of available CPU cores that applications need to execute their tasks within the shortest expected time, which can improve the productivity of using Grid computing resources to solve complex and challenging scientific problems. We performed resource profiling with the SCOUT system on two different Virtual Organizations (VOs) over a one-month period and, based on that information, successfully performed large-scale drug repositioning simulations over 2,000 CPU cores.
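The abstract does not describe how SCOUT is implemented, so the following is only a minimal sketch of the kind of profiling it mentions: aggregating periodic samples per Computing Element into an average response time and an average number of available cores, then preferring the fastest-responding CEs until a requested core count is covered. The `Probe` record, the function names, and the greedy selection rule are all assumptions made for illustration; none of them comes from the paper.

```python
from collections import defaultdict
from dataclasses import dataclass
from statistics import mean


@dataclass
class Probe:
    """One hypothetical profiling sample for a Computing Element (CE)."""
    ce: str               # CE identifier (illustrative, not a real site name)
    free_cores: int       # CPU cores observed as available at probe time
    response_time: float  # seconds from request to resource allocation


def summarize(probes):
    """Aggregate samples per CE into mean response time and mean free cores."""
    by_ce = defaultdict(list)
    for p in probes:
        by_ce[p.ce].append(p)
    return {
        ce: {
            "avg_response": mean(p.response_time for p in samples),
            "avg_free_cores": mean(p.free_cores for p in samples),
        }
        for ce, samples in by_ce.items()
    }


def select_ces(summary, cores_needed):
    """Assumed policy: take CEs in order of increasing average response time
    until their average free cores cover the request.

    Returns (list of chosen CE names, number of cores covered)."""
    chosen, covered = [], 0
    ranked = sorted(summary.items(), key=lambda kv: kv[1]["avg_response"])
    for ce, stats in ranked:
        if covered >= cores_needed:
            break
        cores = int(stats["avg_free_cores"])
        if cores > 0:
            chosen.append(ce)
            covered += cores
    return chosen, covered
```

Calling `summarize` on a month of samples and passing the result to `select_ces` with the core count an application needs would give a candidate set of CEs. A real broker would also have to account for how quickly availability changes between probes, which this sketch ignores.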
