File-Access Characteristics of Data-Intensive Workflow Applications

Takeshi Shibata, SungJun Choi, K. Taura
{"title":"File-Access Characteristics of Data-Intensive Workflow Applications","authors":"Takeshi Shibata, SungJun Choi, K. Taura","doi":"10.1109/CCGRID.2010.77","DOIUrl":null,"url":null,"abstract":"This paper studies five real-world data intensive workflow applications in the fields of natural language processing, astronomy image analysis, and web data analysis. Data intensive workflows are increasingly becoming important applications for cluster and Grid environments. They open new challenges to various components of workflow execution environments including job dispatchers, schedulers, file systems, and file staging tools. Their impacts on real workloads are largely unknown. Under- standing characteristics of real-world workflow applications is a required step to promote research in this area. To this end, we analyse real-world workflow applications focusing on their file access patterns and summarize their implications to schedulers and file system/staging designs.","PeriodicalId":444485,"journal":{"name":"2010 10th IEEE/ACM International Conference on Cluster, Cloud and Grid Computing","volume":"45 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2010-05-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"9","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"2010 10th IEEE/ACM International Conference on Cluster, Cloud and Grid Computing","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/CCGRID.2010.77","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 9

Abstract

This paper studies five real-world data intensive workflow applications in the fields of natural language processing, astronomy image analysis, and web data analysis. Data intensive workflows are increasingly becoming important applications for cluster and Grid environments. They open new challenges to various components of workflow execution environments including job dispatchers, schedulers, file systems, and file staging tools. Their impacts on real workloads are largely unknown. Under- standing characteristics of real-world workflow applications is a required step to promote research in this area. To this end, we analyse real-world workflow applications focusing on their file access patterns and summarize their implications to schedulers and file system/staging designs.
数据密集型工作流应用程序的文件访问特性
本文研究了自然语言处理、天文图像分析和web数据分析等五个实际数据密集型工作流应用。数据密集型工作流正日益成为集群和网格环境中的重要应用。它们对工作流执行环境的各种组件(包括作业调度器、调度器、文件系统和文件分级工具)提出了新的挑战。它们对实际工作负载的影响在很大程度上是未知的。了解实际工作流程应用的特点是推动这一领域研究的必要步骤。为此,我们分析了现实世界中的工作流应用程序,重点关注它们的文件访问模式,并总结了它们对调度器和文件系统/分级设计的影响。
本文章由计算机程序翻译,如有差异,请以英文原文为准。
求助全文
约1分钟内获得全文 求助全文
来源期刊
自引率
0.00%
发文量
0
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
确定
请完成安全验证×
copy
已复制链接
快去分享给好友吧!
我知道了
右上角分享
点击右上角分享
0
联系我们:info@booksci.cn Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。 Copyright © 2023 布克学术 All rights reserved.
京ICP备2023020795号-1
ghs 京公网安备 11010802042870号
Book学术文献互助
Book学术文献互助群
群 号:481959085
Book学术官方微信