Comparison of resource platform selection approaches for scientific workflows

Yogesh L. Simmhan, L. Ramakrishnan
IEEE International Symposium on High-Performance Parallel Distributed Computing, 2010-06-21
DOI: 10.1145/1851476.1851541 (https://doi.org/10.1145/1851476.1851541)
Citations: 19

Abstract

Cloud computing is increasingly considered as an additional computational resource platform for scientific workflows. The cloud offers an opportunity to scale out applications from desktops and local cluster resources. Each platform has different properties (e.g., queue wait times in high-performance systems, virtual machine startup overhead in clouds) and characteristics (e.g., custom environments in the cloud) that make choosing among these diverse resource platforms for a workflow execution a challenge for scientists. Scientists often have to weigh resource platform selection trade-offs with limited information on the actual workflows. While many workflow planning methods have explored resource selection or task scheduling, these methods often require a fine-scale characterization of the workflow that is onerous for a scientist. In this paper, we describe our early exploratory work on using blackbox characteristics for a cost-benefit analysis of different resource platforms. In our blackbox method, we use only limited high-level information on the workflow length, width, and data sizes. The length and width are indicative of the workflow duration and parallelism. We compare the effectiveness of this approach with that of other resource selection models using two exemplar scientific workflows on desktop, local cluster, HPC center, and cloud platforms. Early results suggest that the blackbox model often makes the same resource selections as a more fine-grained whitebox model. We believe the simplicity of the blackbox model can help inform a scientist about the applicability of a new resource platform, such as cloud resources, even before porting an existing workflow.
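
The abstract does not give the formulas behind the blackbox model, so the following Python sketch is only an illustration of the general idea: estimating time and cost per platform from workflow length, width, and data size alone. Every name here (Workflow, Platform, estimate_minutes, estimate_cost), the per-task runtime parameter, and all platform numbers are hypothetical assumptions, not values or methods taken from the paper.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Workflow:
    """Coarse, blackbox description of a workflow (assumed fields)."""
    length: int          # number of sequential stages (proxy for duration)
    width: int           # maximum parallel tasks per stage (proxy for parallelism)
    data_gb: float       # data to move to the platform
    task_minutes: float  # assumed average runtime of one task


@dataclass(frozen=True)
class Platform:
    """Coarse description of a resource platform (assumed fields)."""
    name: str
    cores: int                   # cores available to the workflow
    startup_minutes: float       # queue wait (HPC) or VM startup (cloud)
    bandwidth_gb_per_min: float  # math.inf when the data is already local
    cost_per_core_hour: float    # 0.0 for resources the scientist already owns


def compute_minutes(wf: Workflow, p: Platform) -> float:
    # Each stage runs in "waves" when it is wider than the platform.
    waves = math.ceil(wf.width / p.cores)
    return wf.length * waves * wf.task_minutes


def estimate_minutes(wf: Workflow, p: Platform) -> float:
    # Total time = startup overhead + data transfer + compute.
    transfer = wf.data_gb / p.bandwidth_gb_per_min
    return p.startup_minutes + transfer + compute_minutes(wf, p)


def estimate_cost(wf: Workflow, p: Platform) -> float:
    # Charge only for the cores the workflow can actually keep busy.
    cores_used = min(wf.width, p.cores)
    return cores_used * (compute_minutes(wf, p) / 60.0) * p.cost_per_core_hour


# Hypothetical workflow and platforms, for illustration only.
wf = Workflow(length=4, width=16, data_gb=10.0, task_minutes=5.0)
platforms = [
    Platform("desktop", cores=4, startup_minutes=0.0,
             bandwidth_gb_per_min=math.inf, cost_per_core_hour=0.0),
    Platform("hpc", cores=64, startup_minutes=120.0,
             bandwidth_gb_per_min=2.0, cost_per_core_hour=0.0),
    Platform("cloud", cores=16, startup_minutes=5.0,
             bandwidth_gb_per_min=1.0, cost_per_core_hour=0.10),
]

fastest = min(platforms, key=lambda p: estimate_minutes(wf, p))
for p in platforms:
    print(p.name, estimate_minutes(wf, p), round(estimate_cost(wf, p), 2))
print("fastest:", fastest.name)
```

With these made-up inputs the trade-off the abstract describes becomes visible: the platform with the most cores is not necessarily the fastest once queue wait or VM startup and data transfer are counted, and the fastest platform may be the only one that costs money. A whitebox model would replace the single task_minutes value with per-task measurements.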