Effective resource multiplexing for scientific workflows

T. Ryan, Young Choon Lee
Published in: 2015 17th Asia-Pacific Network Operations and Management Symposium (APNOMS)
Publication date: 2015-08-01
DOI: 10.1109/APNOMS.2015.7275432 (https://doi.org/10.1109/APNOMS.2015.7275432)
Citations: 2

Abstract

Scientific workflows feature complex precedence constraints that are mostly dictated by data dependencies between tasks. The inter-task communication (data staging) in these workflow applications incurs significant overheads, which are a major obstacle to high performance and effective resource utilization. As the scale of these applications grows, primarily because of the recent explosive growth of data, addressing this obstacle is of great practical importance. In this paper, we present a resource multiplexing (RM) technique that leverages data staging to minimize the idle time between task executions caused by inter-task communication overheads. In particular, we incorporate RM into our DEWE framework through a set of extensions to the framework. The rationale behind RM is that each slot or core pairs the actual workflow task with the RM-enabled file-loading DEWE extension (File Client) so that their resource usage is complementary. We demonstrate the efficacy of our multiplexing technique in a data-intensive computing environment using an astronomy application. Experiments conducted on Amazon EC2 show that the technique reduces resource idle time between jobs by 57% on average and by up to 91%.
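The pairing idea in the abstract can be illustrated with a minimal sketch. This is not DEWE code or its API: `run_slot`, `stage_in`, and `execute` are hypothetical names chosen for illustration, and a single background thread stands in for the File Client. The sketch only shows the general pattern of staging the next task's input while the current task computes, so that one slot does not sit idle waiting for data.

```python
from concurrent.futures import ThreadPoolExecutor


def run_slot(tasks, stage_in, execute):
    """Run tasks on one slot, overlapping data staging with execution.

    stage_in(task) -> data      hypothetical loader (stand-in for a file client)
    execute(task, data) -> out  hypothetical compute step for the task
    """
    results = []
    if not tasks:
        return results
    # One loader thread per slot: it fetches input for the next task
    # while the slot itself is busy executing the current task.
    with ThreadPoolExecutor(max_workers=1) as loader:
        pending = loader.submit(stage_in, tasks[0])
        for i, task in enumerate(tasks):
            data = pending.result()  # blocks only if staging is still running
            if i + 1 < len(tasks):
                pending = loader.submit(stage_in, tasks[i + 1])
            results.append(execute(task, data))
    return results
```

Because staging is mostly I/O-bound and task execution is mostly CPU-bound, the two activities compete little for the same resource, which is the sense in which their usage is complementary. The idle-time reductions reported in the abstract are the authors' measurements and are not reproduced by this sketch.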