Embedded domain-specific language and runtime system for progressive spatiotemporal data analysis and visualization

Cameron Christensen, Ji-Woo Lee, Shusen Liu, P. Bremer, G. Scorzelli, Valerio Pascucci
Published in: 2016 IEEE 6th Symposium on Large Data Analysis and Visualization (LDAV)
Publication date: 2016-08-16
DOI: 10.1109/LDAV.2016.7874304 (https://doi.org/10.1109/LDAV.2016.7874304)
Citations: 5

Abstract

As our ability to generate large and complex datasets grows, accessing and processing these massive data collections is increasingly the primary bottleneck in scientific analysis. Challenges include retrieving, converting, resampling, and combining remote and often disparately located data ensembles with only limited support from existing tools. In particular, existing solutions rely predominantly on extensive data transfers or large-scale remote computing resources, both of which are inherently offline processes with long delays and substantial repercussions for any mistakes. Such workflows severely limit the flexible exploration and rapid evaluation of new hypotheses that are crucial to the scientific process and thereby impede scientific discovery. Here we present an embedded domain-specific language (EDSL) specifically designed for the interactive exploration of large-scale, remote data. Our EDSL allows users to express a wide range of data analysis operations in a simple and abstract manner. The underlying runtime system transparently resolves issues such as remote data access and resampling while at the same time maintaining interactivity through progressive and interruptible computation. This system enables, for the first time, interactive remote exploration of massive datasets such as the 7km NASA GEOS-5 Nature Run simulation, which previously have been analyzed only offline or at reduced resolution.
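The abstract mentions "progressive and interruptible computation" without giving detail. As a rough illustration of that general idea only, and not the authors' EDSL or runtime, the sketch below estimates the mean of a field coarse to fine and lets the caller stop after any pass. All names here (`progressive_mean`, `field`, `explore`) and the synthetic data are hypothetical.

```python
from itertools import islice
from typing import Callable, Iterator, List, Tuple


def progressive_mean(
    sample: Callable[[int, int], float],
    size: int,
    coarsest_stride: int,
) -> Iterator[Tuple[int, float]]:
    """Yield (stride, mean) estimates of a size x size field, coarse to fine.

    `sample(row, col)` stands in for a possibly remote data fetch.
    Each pass halves the stride, so later estimates use more samples.
    """
    stride = coarsest_stride
    while stride >= 1:
        total = 0.0
        count = 0
        for row in range(0, size, stride):
            for col in range(0, size, stride):
                total += sample(row, col)
                count += 1
        yield stride, total / count
        stride //= 2


def field(row: int, col: int) -> float:
    # Hypothetical synthetic field used only for this illustration.
    return float(row + col)


def explore(max_passes: int) -> List[Tuple[int, float]]:
    """Consume at most `max_passes` refinement passes, then stop.

    Because the generator is lazy, passes that are never requested
    are never computed; this is the "interruptible" part of the sketch.
    """
    return list(islice(progressive_mean(field, size=8, coarsest_stride=4), max_passes))


if __name__ == "__main__":
    for stride, estimate in explore(max_passes=2):
        print(f"stride={stride} mean={estimate}")
```

Here a generator models interruption: a viewer could draw each coarse estimate as it arrives and simply stop requesting refinements when the user changes the query.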