Towards scalable ad-hoc climate anomalies search

P. Baumann, D. Misev
Venue: International Workshop on Analytics for Big Geospatial Data
Publication date: 2012-11-06
DOI: 10.1145/2447481.2447493 (https://doi.org/10.1145/2447481.2447493)
Citations: 5

Abstract

Meteorological data contribute significantly to "Big Data". The challenge lies not only in their volume, which reaches petabyte sizes for single objects, but also in the number of dimensions: such general 4-D spatio-temporal data cannot be handled with traditional GIS methods and tools. Climate data in fact tend to go beyond these dimensions by adding a second time dimension for the simulation run time, resulting in 5-D data cubes.

Traditional databases, known for their flexibility and scalability, have proven inadequate because they lack support for multi-dimensional rasters. Consequently, file-based implementations rather than databases are used to serve such data to the community. This limitation has recently been overcome by array databases, which provide storage and query support for multi-dimensional rasters and thereby bring the scalability and flexibility advantages of databases to climate data management.

In this contribution, we present a case study in which non-trivial analytics functionality on n-D climate data cubes has been established. Storage optimization techniques that are novel to standard databases allow the system to be tuned for interactive response in many cases. We briefly introduce the rasdaman database system used, present the database schema and a use case of practically important queries, and report preliminary performance observations. To the best of our knowledge, this is the first non-academic, real-life deployment of an array database for data sets of up to 5-D.
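As a rough illustration of what an anomaly search over a 5-D climate cube involves, the sketch below uses an in-memory NumPy array, not rasdaman or its query language. The axis order (run, time, level, lat, lon), the function name `find_anomalies`, the threshold, and the definition of an anomaly as a deviation from the per-cell mean over the time axis are all assumptions made for this example; the abstract does not specify any of them.

```python
import numpy as np

# Hypothetical 5-D cube. The axis order is an assumption for illustration:
# (simulation run, forecast time, vertical level, latitude, longitude).
cube = np.full((2, 12, 3, 4, 5), 15.0)

# Plant one artificial extreme value so the search has something to find.
cube[1, 7, 0, 2, 3] += 10.0


def find_anomalies(cube, threshold):
    """Return index tuples of cells that deviate from their time mean.

    The baseline is the mean along the time axis (axis 1), computed
    separately for every run, level, and grid cell.
    """
    baseline = cube.mean(axis=1, keepdims=True)
    anomaly = cube - baseline
    hits = np.argwhere(np.abs(anomaly) > threshold)
    return [tuple(int(i) for i in idx) for idx in hits]


hits = find_anomalies(cube, threshold=5.0)
print(hits)  # [(1, 7, 0, 2, 3)]
```

An array database would express the same kind of aggregation and thresholding as a declarative query evaluated next to the stored data, rather than loading the whole cube into client memory as this sketch does.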