Workflow Management in a Protein Clustering Application

Seventh IEEE International Symposium on Cluster Computing and the Grid (CCGrid '07). Pub Date: 2007-05-14. DOI: 10.1109/CCGRID.2007.122

J. L. Vázquez-Poletti, E. Huedo, R. Montero, I. Llorente

Citations: 6

Abstract

Bioinformatics demands more computational resources every day. The problems posed in this area are growing so complex that traditional computing systems cannot cope with them. Solving complex problems that can be divided into tasks with dependencies requires a workflow management system. In this paper, we introduce the use of the workflow management capabilities of the GridWay metascheduler to run a Bioinformatics application that implements a complex protein clustering algorithm in order to obtain non-redundant protein databases. Using a general-purpose metascheduling system gives the application the fault tolerance and advanced scheduling capabilities it needs to execute in a highly dynamic, heterogeneous and faulty environment. The execution results on a production Grid (the EGEE infrastructure) show the dramatic impact of remote queue waiting times on application performance and the critical need for efficient rescheduling capabilities.

