SciCumulus:一个轻量级的云中间件,用于探索科学工作流中的许多任务计算范式

Daniel de Oliveira, Eduardo S. Ogasawara, F. Baião, M. Mattoso
{"title":"SciCumulus:一个轻量级的云中间件,用于探索科学工作流中的许多任务计算范式","authors":"Daniel de Oliveira, Eduardo S. Ogasawara, F. Baião, M. Mattoso","doi":"10.1109/CLOUD.2010.64","DOIUrl":null,"url":null,"abstract":"Most of the large-scale scientific experiments modeled as scientific workflows produce a large amount of data and require workflow parallelism to reduce workflow execution time. Some of the existing Scientific Workflow Management Systems (SWfMS) explore parallelism techniques - such as parameter sweep and data fragmentation. In those systems, several computing resources are used to accomplish many computational tasks in homogeneous environments, such as multiprocessor machines or cluster systems. Cloud computing has become a popular high performance computing model in which (virtualized) resources are provided as services over the Web. Some scientists are starting to adopt the cloud model in scientific domains and are moving their scientific workflows (programs and data) from local environments to the cloud. Nevertheless, it is still difficult for the scientist to express a parallel computing paradigm for the workflow on the cloud. Capturing distributed provenance data at the cloud is also an issue. Existing approaches for executing scientific workflows using parallel processing are mainly focused on homogeneous environments whereas, in the cloud, the scientist has to manage new aspects such as initialization of virtualized instances, scheduling over different cloud environments, impact of data transferring and management of instance images. In this paper we propose SciCumulus, a cloud middleware that explores parameter sweep and data fragmentation parallelism in scientific workflow activities (with provenance support). It works between the SWfMS and the cloud. SciCumulus is designed considering cloud specificities. We have evaluated our approach by executing simulated experiments to analyze the overhead imposed by clouds on the workflow execution time.","PeriodicalId":375404,"journal":{"name":"2010 IEEE 3rd International Conference on Cloud Computing","volume":"154 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2010-07-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"167","resultStr":"{\"title\":\"SciCumulus: A Lightweight Cloud Middleware to Explore Many Task Computing Paradigm in Scientific Workflows\",\"authors\":\"Daniel de Oliveira, Eduardo S. Ogasawara, F. Baião, M. Mattoso\",\"doi\":\"10.1109/CLOUD.2010.64\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Most of the large-scale scientific experiments modeled as scientific workflows produce a large amount of data and require workflow parallelism to reduce workflow execution time. Some of the existing Scientific Workflow Management Systems (SWfMS) explore parallelism techniques - such as parameter sweep and data fragmentation. In those systems, several computing resources are used to accomplish many computational tasks in homogeneous environments, such as multiprocessor machines or cluster systems. Cloud computing has become a popular high performance computing model in which (virtualized) resources are provided as services over the Web. Some scientists are starting to adopt the cloud model in scientific domains and are moving their scientific workflows (programs and data) from local environments to the cloud. Nevertheless, it is still difficult for the scientist to express a parallel computing paradigm for the workflow on the cloud. Capturing distributed provenance data at the cloud is also an issue. Existing approaches for executing scientific workflows using parallel processing are mainly focused on homogeneous environments whereas, in the cloud, the scientist has to manage new aspects such as initialization of virtualized instances, scheduling over different cloud environments, impact of data transferring and management of instance images. In this paper we propose SciCumulus, a cloud middleware that explores parameter sweep and data fragmentation parallelism in scientific workflow activities (with provenance support). It works between the SWfMS and the cloud. SciCumulus is designed considering cloud specificities. We have evaluated our approach by executing simulated experiments to analyze the overhead imposed by clouds on the workflow execution time.\",\"PeriodicalId\":375404,\"journal\":{\"name\":\"2010 IEEE 3rd International Conference on Cloud Computing\",\"volume\":\"154 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2010-07-05\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"167\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"2010 IEEE 3rd International Conference on Cloud Computing\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1109/CLOUD.2010.64\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"2010 IEEE 3rd International Conference on Cloud Computing","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/CLOUD.2010.64","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 167

摘要

以科学工作流为模型的大规模科学实验大多产生大量数据,需要工作流并行化以减少工作流执行时间。一些现有的科学工作流管理系统(SWfMS)探索了并行技术,如参数扫描和数据碎片。在这些系统中,使用多个计算资源来完成同构环境中的许多计算任务,例如多处理器机器或集群系统。云计算已经成为一种流行的高性能计算模型,其中(虚拟化)资源通过Web作为服务提供。一些科学家开始在科学领域采用云模型,并将他们的科学工作流程(程序和数据)从本地环境转移到云上。然而,对于科学家来说,在云上表达工作流的并行计算范式仍然是困难的。在云端获取分布式来源数据也是一个问题。使用并行处理执行科学工作流的现有方法主要集中在同质环境中,而在云中,科学家必须管理新的方面,如虚拟实例的初始化、不同云环境的调度、数据传输的影响和实例映像的管理。在本文中,我们提出了SciCumulus,这是一个云中间件,它探索了科学工作流活动中的参数扫描和数据碎片并行性(具有来源支持)。它在SWfMS和云之间工作。SciCumulus的设计考虑了云的特性。我们通过执行模拟实验来评估我们的方法,以分析云对工作流执行时间施加的开销。
本文章由计算机程序翻译,如有差异,请以英文原文为准。
SciCumulus: A Lightweight Cloud Middleware to Explore Many Task Computing Paradigm in Scientific Workflows
Most of the large-scale scientific experiments modeled as scientific workflows produce a large amount of data and require workflow parallelism to reduce workflow execution time. Some of the existing Scientific Workflow Management Systems (SWfMS) explore parallelism techniques - such as parameter sweep and data fragmentation. In those systems, several computing resources are used to accomplish many computational tasks in homogeneous environments, such as multiprocessor machines or cluster systems. Cloud computing has become a popular high performance computing model in which (virtualized) resources are provided as services over the Web. Some scientists are starting to adopt the cloud model in scientific domains and are moving their scientific workflows (programs and data) from local environments to the cloud. Nevertheless, it is still difficult for the scientist to express a parallel computing paradigm for the workflow on the cloud. Capturing distributed provenance data at the cloud is also an issue. Existing approaches for executing scientific workflows using parallel processing are mainly focused on homogeneous environments whereas, in the cloud, the scientist has to manage new aspects such as initialization of virtualized instances, scheduling over different cloud environments, impact of data transferring and management of instance images. In this paper we propose SciCumulus, a cloud middleware that explores parameter sweep and data fragmentation parallelism in scientific workflow activities (with provenance support). It works between the SWfMS and the cloud. SciCumulus is designed considering cloud specificities. We have evaluated our approach by executing simulated experiments to analyze the overhead imposed by clouds on the workflow execution time.
求助全文
通过发布文献求助,成功后即可免费获取论文全文。 去求助
来源期刊
自引率
0.00%
发文量
0
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
确定
请完成安全验证×
copy
已复制链接
快去分享给好友吧!
我知道了
右上角分享
点击右上角分享
0
联系我们:info@booksci.cn Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。 Copyright © 2023 布克学术 All rights reserved.
京ICP备2023020795号-1
ghs 京公网安备 11010802042870号
Book学术文献互助
Book学术文献互助群
群 号:481959085
Book学术官方微信