Streaming satellite data to cloud workflows for on-demand computing of environmental data products

Daniel Zinn, Q. Hart, Bertram Ludäscher, Yogesh L. Simmhan
Venue: The 5th Workshop on Workflows in Support of Large-Scale Science
Published: 2010-12-17
DOI: 10.1109/WORKS.2010.5671841 (https://doi.org/10.1109/WORKS.2010.5671841)
Citations: 12

Abstract

Environmental data arriving constantly from satellites and weather stations are used to compute weather coefficients that are essential for agriculture and viticulture. For example, the reference evapotranspiration (ET0) coefficient, overlaid on regional maps, is provided each day by the California Department of Water Resources to local farmers and turf managers so that they can plan daily water use. Scaling out single-processor, compute- and data-intensive applications that operate on real-time data, so that they support more users and higher-resolution data, poses data engineering challenges. Cloud computing helps data providers expand resource capacity to meet growing needs, and it also supports scientific needs such as reprocessing historic data with new models. In this article, we examine the migration of a legacy script used by CIMIS for daily ET0 computation to a workflow model that eases deployment to, and scaling on, the Windows Azure Cloud. Our architecture incorporates a direct streaming model into Cloud virtual machines (VMs) that improves performance by 130% to 160% for our workflow compared with the commonly used approach of staging data through Cloud storage. The streaming workflows achieve runtimes comparable to desktop execution on a single VM and a linear speed-up when using multiple VMs, thus allowing environmental coefficients to be computed at a much higher resolution than is done at present.
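The linear speed-up across multiple VMs is plausible when each map cell can be computed independently of its neighbours. The following is a minimal sketch of that idea under stated assumptions, not the authors' workflow: `placeholder_coefficient` is a hypothetical stand-in (it is not the ET0 formula), `split_rows`, `process_band`, and `compute_map` are illustrative names, and local threads stand in for Cloud VMs.

```python
from concurrent.futures import ThreadPoolExecutor


def placeholder_coefficient(temperature_c, radiation):
    # Hypothetical per-cell computation; NOT the real ET0 formula.
    return 0.01 * radiation * (temperature_c + 20.0)


def split_rows(n_rows, n_workers):
    """Split range(n_rows) into at most n_workers contiguous (start, end) bands."""
    base, extra = divmod(n_rows, n_workers)
    bands, start = [], 0
    for i in range(n_workers):
        size = base + (1 if i < extra else 0)
        if size:  # skip empty bands when there are more workers than rows
            bands.append((start, start + size))
        start += size
    return bands


def process_band(grid, band):
    """Compute the coefficient for every cell in rows [start, end) of the grid."""
    start, end = band
    return [
        [placeholder_coefficient(t, r) for (t, r) in row]
        for row in grid[start:end]
    ]


def compute_map(grid, n_workers):
    """Compute the full map by handing one row band to each worker."""
    bands = split_rows(len(grid), n_workers)
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        # pool.map preserves band order, so rows are reassembled in order.
        parts = list(pool.map(lambda band: process_band(grid, band), bands))
    return [row for part in parts for row in part]


if __name__ == "__main__":
    # Assumed toy input: a 5 x 3 grid of (temperature, radiation) pairs.
    grid = [[(15.0 + r, 200.0 + c) for c in range(3)] for r in range(5)]
    assert compute_map(grid, 1) == compute_map(grid, 3)
    print(split_rows(len(grid), 3))
```

Because no band depends on another, adding workers changes only how the rows are divided, not the result. In a real deployment the cost of moving input data to each worker, which the abstract addresses through streaming rather than storage staging, would determine how close to linear the speed-up is.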