2019 15th International Conference on eScience (eScience): Latest Papers

Workflow Design Analysis for High Resolution Satellite Image Analysis
2019 15th International Conference on eScience (eScience) Pub Date : 2019-05-23 DOI: 10.1109/eScience.2019.00013
Ioannis Paraskevakos, M. Turilli, B. Gonçalves, H. Lynch, S. Jha
Abstract: Ecological sciences are using imagery from a variety of sources to monitor and survey populations and ecosystems. Very High Resolution (VHR) satellite imagery provides an effective dataset for large-scale surveys. Convolutional Neural Networks have been successfully employed to analyze such imagery and detect large animals. As the datasets increase in volume, O(TB), and number of images, O(1k), utilizing High Performance Computing (HPC) resources becomes necessary. In this paper, we investigate task-parallel, data-driven workflow designs to support imagery analysis pipelines with heterogeneous tasks on HPC. We analyze the capabilities of each design when processing a dataset of 3,000 VHR satellite images for a total of 4 TB. We experimentally model the execution time of the tasks of the image processing pipeline. We perform experiments to characterize the resource utilization, total time to completion, and overheads of each design. Based on the model, overhead, and utilization analysis, we show which design approach is best suited to scientific pipelines with similar characteristics.
Citations: 5
Custom Execution Environments with Containers in Pegasus-Enabled Scientific Workflows
2019 15th International Conference on eScience (eScience) Pub Date : 2019-05-20 DOI: 10.1109/eScience.2019.00039
K. Vahi, M. Rynge, G. Papadimitriou, Duncan A. Brown, R. Mayani, Rafael Ferreira da Silva, E. Deelman, A. Mandal, Eric J. Lyons, M. Zink
Abstract: Science reproducibility is a cornerstone feature in scientific workflows. In most cases, this has been implemented as a way to exactly reproduce the computational steps taken to reach the final results. While these steps are often completely described, including the input parameters, datasets, and codes, the environment in which these steps are executed is only described at a higher level, with endpoints and operating system names and versions. Though this may be sufficient for reproducibility in the short term, systems evolve and are replaced over time, breaking the underlying workflow reproducibility. A natural solution to this problem is containers, as they are well defined, have a lifetime independent of the underlying system, and can be user-controlled so that they can provide custom environments if needed. This paper highlights some unique challenges that may arise when using containers in distributed scientific workflows. Further, this paper explores how the Pegasus Workflow Management System implements container support to address such challenges.
Citations: 7
SOMOSPIE: A Modular SOil MOisture SPatial Inference Engine Based on Data-Driven Decisions
2019 15th International Conference on eScience (eScience) Pub Date : 2019-04-16 DOI: 10.1109/eScience.2019.00008
Danny Rorabaugh, M. Guevara, R. Llamas, J. Kitson, R. Vargas, M. Taufer
Abstract: The current availability of soil moisture data over large areas comes from satellite remote sensing technologies (i.e., radar-based systems), but these data have coarse resolution and often exhibit large spatial information gaps. Where data are too coarse or sparse for a given need (e.g., precision farming), one can leverage machine-learning techniques coupled with other sources of environmental information (e.g., topography) to generate gap-free information at a finer spatial resolution (i.e., increased granularity). To this end, we develop a spatial inference engine consisting of modular stages for processing spatial environmental data, generating predictions with machine-learning techniques, and analyzing these predictions. We demonstrate the functionality of this approach and the effects of data processing choices via multiple prediction maps over a United States ecological region with a highly diverse soil moisture profile (i.e., the Middle Atlantic Coastal Plains). The relevance of our work derives from a pressing need to improve the spatial representation of soil moisture for applications in environmental sciences (e.g., ecological niche modeling, carbon monitoring systems, and other Earth system models) and precision farming (e.g., optimizing irrigation practices and other land management decisions).
Citations: 6
Simulating Data Access Profiles of Computational Jobs in Data Grids
2019 15th International Conference on eScience (eScience) Pub Date : 2019-02-26 DOI: 10.1109/eScience.2019.00051
Volodimir Begy, Joeri Hermans, M. Barisits, M. Lassnig, E. Schikuta
Abstract: The data access patterns of applications running in computing grids are changing due to the recent proliferation of high-speed local and wide area networks. Data-intensive jobs are no longer strictly required to run at the computing sites where the respective input data are located. Instead, jobs may access the data employing arbitrary combinations of data placement, stage-in, and remote data access. These data access profiles exhibit partially non-overlapping throughput bottlenecks. This fact can be exploited in order to minimize the time jobs spend waiting for input data. In this work we present a novel grid computing simulator, which puts a heavy emphasis on the various data access profiles. Its purpose is to enable reproducible performance studies on data access patterns. The fundamental assumptions underlying our simulator are justified by empirical experiments performed in the Worldwide LHC Computing Grid (WLCG) at CERN. We demonstrate how to calibrate the simulator parameters in accordance with the true system using posterior inference with likelihood-free Markov Chain Monte Carlo. Thereafter, we validate the simulator's output with respect to authentic production workloads from WLCG, demonstrating its remarkable accuracy.
Citations: 0