AtlasHDF: an efficient big data framework for GeoAI

M. Werner, Haomin Li
Published in: Proceedings of the 10th ACM SIGSPATIAL International Workshop on Analytics for Big Geospatial Data
Publication date: 2022-11-01
DOI: https://doi.org/10.1145/3557917.3567615
Citations: 4

Abstract

The last decade has witnessed rapid development in geospatial applications of artificial intelligence (GeoAI). However, out of step with progress in wider computer science, the geospatial community has long kept working with powerful but over-sophisticated tools and software whose functionality goes far beyond the basic needs of GeoAI tasks. To a certain extent, this hinders progress towards sustainable and replicable GeoAI models. In this paper, we address this challenge by introducing AtlasHDF, an efficient big data framework based on modern HDF5 technology, in which we designed lossless data mappings (immediate mapping and analysis-ready mapping) from OpenStreetMap (OSM) vector data into a single HDF5 data container to facilitate fast and flexible GeoAI applications learned from OSM data. Since HDF5 is included as a default dependency in most GeoAI and high-performance computing (HPC) environments, AtlasHDF provides a cross-platform, single-technology solution for handling heterogeneous big geodata for GeoAI. As a case study, we conducted a comparative analysis of the AtlasHDF framework against three commonly used data formats (PBF, Shapefile, and GeoPackage) using the latest OSM data for the city of Berlin (Germany), then elaborated on the advantages of each data format with respect to file size, querying, rendering, dependencies, and data extensibility. Given the wide range of GeoAI tasks that could benefit from our framework, our future work will focus on extending it to heterogeneous big geodata (vector and raster) to support seamless and fast data integration without any geospatial software dependency until the training stage of GeoAI. A reference implementation of the framework developed in this paper is publicly available at: https://github.com/tumbgd/hdf4water.
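As a rough illustration of the idea of mapping OSM vector data into a single HDF5 container, the sketch below stores a few made-up OSM-style nodes with h5py and runs a bounding-box query using only NumPy. The group layout (`osm/nodes`), the dataset names, and the helper functions are assumptions chosen for this example. They are not the schema or API of AtlasHDF or of the reference implementation.

```python
import h5py
import numpy as np

# Toy OSM-style nodes: (id, lon, lat, tags). All values are illustrative.
nodes = [
    (1, 13.3777, 52.5163, {"name": "Brandenburger Tor", "tourism": "attraction"}),
    (2, 13.4094, 52.5208, {"name": "Fernsehturm"}),
    (3, 13.3761, 52.5186, {}),
]


def write_nodes(h5, nodes):
    """Store ids, coordinates, and tags under a hypothetical 'osm/nodes' group."""
    grp = h5.create_group("osm/nodes")
    grp.create_dataset("id", data=np.array([n[0] for n in nodes], dtype=np.int64))
    grp.create_dataset(
        "coords", data=np.array([[n[1], n[2]] for n in nodes], dtype=np.float64)
    )
    # Tags are flattened into (node index, key, value) rows so that
    # arbitrary key/value pairs survive the mapping unchanged.
    rows = [(i, k, v) for i, n in enumerate(nodes) for k, v in n[3].items()]
    str_dt = h5py.string_dtype(encoding="utf-8")
    grp.create_dataset("tag_node", data=np.array([r[0] for r in rows], dtype=np.int64))
    grp.create_dataset("tag_key", data=[r[1] for r in rows], dtype=str_dt)
    grp.create_dataset("tag_value", data=[r[2] for r in rows], dtype=str_dt)


def nodes_in_bbox(h5, min_lon, min_lat, max_lon, max_lat):
    """Return ids of nodes inside the bounding box, with no GIS dependency."""
    coords = h5["osm/nodes/coords"][:]
    ids = h5["osm/nodes/id"][:]
    mask = (
        (coords[:, 0] >= min_lon) & (coords[:, 0] <= max_lon)
        & (coords[:, 1] >= min_lat) & (coords[:, 1] <= max_lat)
    )
    return ids[mask].tolist()


def tags_of(h5, node_index):
    """Rebuild the tag dictionary of one node from the flattened tag rows."""
    grp = h5["osm/nodes"]
    sel = grp["tag_node"][:] == node_index
    keys = grp["tag_key"][:][sel]
    vals = grp["tag_value"][:][sel]

    def dec(s):
        # Depending on the h5py version, strings come back as bytes or str.
        return s.decode("utf-8") if isinstance(s, bytes) else s

    return {dec(k): dec(v) for k, v in zip(keys, vals)}


# The 'core' driver with backing_store=False keeps the file in memory only.
with h5py.File("atlas_sketch.h5", "w", driver="core", backing_store=False) as h5:
    write_nodes(h5, nodes)
    hits = nodes_in_bbox(h5, 13.37, 52.51, 13.38, 52.52)
    tags = tags_of(h5, 1)

print(hits)
print(tags)
```

In this sketch the spatial filter is a plain array comparison, which reflects the abstract's point that a GeoAI pipeline can read such a container with only HDF5 and NumPy available. A real mapping would also need to cover OSM ways and relations, which are omitted here.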