全球栅格数据集的人工智能增强概率数据结构

IF 1.2 Q4 REMOTE SENSING
M. Werner
{"title":"全球栅格数据集的人工智能增强概率数据结构","authors":"M. Werner","doi":"10.1145/3453184","DOIUrl":null,"url":null,"abstract":"In the last decade, more and more spatial data has been acquired on a global scale due to satellite missions, social media, and coordinated governmental activities. This observational data suffers from huge storage footprints and makes global analysis challenging. Therefore, many information products have been designed in which observations are turned into global maps showing features such as land cover or land use, often with only a few discrete values and sparse spatial coverage like only within cities. Traditional coding of such data as a raster image becomes challenging due to the sizes of the datasets and spatially non-local access patterns, for example, when labeling social media streams. This article proposes GloBiMap, a randomized data structure, based on Bloom filters, for modeling low-cardinality sparse raster images of excessive sizes in a configurable amount of memory with pure random access operations avoiding costly intermediate decompression. In addition, the data structure is designed to correct the inevitable errors of the randomized layer in order to have a fully exact representation. We show the feasibility of the approach on several real-world datasets including the Global Urban Footprint in which each pixel denotes whether a particular location contains a building at a resolution of roughly 10m globally as well as on a global Twitter sample of more than 220 million precisely geolocated tweets. In addition, we propose the integration of a denoiser engine based on artificial intelligence in order to reduce the amount of error correction information for extremely compressive GloBiMaps.","PeriodicalId":43641,"journal":{"name":"ACM Transactions on Spatial Algorithms and Systems","volume":" ","pages":"1 - 24"},"PeriodicalIF":1.2000,"publicationDate":"2021-06-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://sci-hub-pdf.com/10.1145/3453184","citationCount":"1","resultStr":"{\"title\":\"GloBiMapsAI: An AI-Enhanced Probabilistic Data Structure for Global Raster Datasets\",\"authors\":\"M. Werner\",\"doi\":\"10.1145/3453184\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"In the last decade, more and more spatial data has been acquired on a global scale due to satellite missions, social media, and coordinated governmental activities. This observational data suffers from huge storage footprints and makes global analysis challenging. Therefore, many information products have been designed in which observations are turned into global maps showing features such as land cover or land use, often with only a few discrete values and sparse spatial coverage like only within cities. Traditional coding of such data as a raster image becomes challenging due to the sizes of the datasets and spatially non-local access patterns, for example, when labeling social media streams. This article proposes GloBiMap, a randomized data structure, based on Bloom filters, for modeling low-cardinality sparse raster images of excessive sizes in a configurable amount of memory with pure random access operations avoiding costly intermediate decompression. In addition, the data structure is designed to correct the inevitable errors of the randomized layer in order to have a fully exact representation. We show the feasibility of the approach on several real-world datasets including the Global Urban Footprint in which each pixel denotes whether a particular location contains a building at a resolution of roughly 10m globally as well as on a global Twitter sample of more than 220 million precisely geolocated tweets. In addition, we propose the integration of a denoiser engine based on artificial intelligence in order to reduce the amount of error correction information for extremely compressive GloBiMaps.\",\"PeriodicalId\":43641,\"journal\":{\"name\":\"ACM Transactions on Spatial Algorithms and Systems\",\"volume\":\" \",\"pages\":\"1 - 24\"},\"PeriodicalIF\":1.2000,\"publicationDate\":\"2021-06-21\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"https://sci-hub-pdf.com/10.1145/3453184\",\"citationCount\":\"1\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"ACM Transactions on Spatial Algorithms and Systems\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1145/3453184\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"Q4\",\"JCRName\":\"REMOTE SENSING\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"ACM Transactions on Spatial Algorithms and Systems","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1145/3453184","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q4","JCRName":"REMOTE SENSING","Score":null,"Total":0}
引用次数: 1

摘要

在过去的十年里,由于卫星任务、社交媒体和协调的政府活动,在全球范围内获得了越来越多的空间数据。这些观测数据存在巨大的存储足迹,使全球分析具有挑战性。因此,已经设计了许多信息产品,将观测结果转化为显示土地覆盖或土地利用等特征的全球地图,通常只有几个离散值,空间覆盖范围稀疏,就像只在城市内一样。例如,在标记社交媒体流时,由于数据集的大小和空间上的非本地访问模式,将这种数据编码为光栅图像变得具有挑战性。本文提出了GloBiMap,这是一种基于Bloom滤波器的随机数据结构,用于在可配置的内存量中建模过大的低基数稀疏光栅图像,采用纯随机访问操作,避免了昂贵的中间解压缩。此外,数据结构被设计为校正随机化层的不可避免的错误,以便具有完全精确的表示。我们在几个真实世界的数据集上展示了该方法的可行性,包括全球城市足迹,其中每个像素表示特定位置是否包含全球分辨率约为1000米的建筑,以及在超过2.2亿条精确地理定位推文的全球推特样本上。此外,我们建议集成基于人工智能的去噪器引擎,以减少极压缩GloBiMaps的纠错信息量。
本文章由计算机程序翻译,如有差异,请以英文原文为准。
GloBiMapsAI: An AI-Enhanced Probabilistic Data Structure for Global Raster Datasets
In the last decade, more and more spatial data has been acquired on a global scale due to satellite missions, social media, and coordinated governmental activities. This observational data suffers from huge storage footprints and makes global analysis challenging. Therefore, many information products have been designed in which observations are turned into global maps showing features such as land cover or land use, often with only a few discrete values and sparse spatial coverage like only within cities. Traditional coding of such data as a raster image becomes challenging due to the sizes of the datasets and spatially non-local access patterns, for example, when labeling social media streams. This article proposes GloBiMap, a randomized data structure, based on Bloom filters, for modeling low-cardinality sparse raster images of excessive sizes in a configurable amount of memory with pure random access operations avoiding costly intermediate decompression. In addition, the data structure is designed to correct the inevitable errors of the randomized layer in order to have a fully exact representation. We show the feasibility of the approach on several real-world datasets including the Global Urban Footprint in which each pixel denotes whether a particular location contains a building at a resolution of roughly 10m globally as well as on a global Twitter sample of more than 220 million precisely geolocated tweets. In addition, we propose the integration of a denoiser engine based on artificial intelligence in order to reduce the amount of error correction information for extremely compressive GloBiMaps.
求助全文
通过发布文献求助,成功后即可免费获取论文全文。 去求助
来源期刊
CiteScore
4.40
自引率
5.30%
发文量
43
期刊介绍: ACM Transactions on Spatial Algorithms and Systems (TSAS) is a scholarly journal that publishes the highest quality papers on all aspects of spatial algorithms and systems and closely related disciplines. It has a multi-disciplinary perspective in that it spans a large number of areas where spatial data is manipulated or visualized (regardless of how it is specified - i.e., geometrically or textually) such as geography, geographic information systems (GIS), geospatial and spatiotemporal databases, spatial and metric indexing, location-based services, web-based spatial applications, geographic information retrieval (GIR), spatial reasoning and mining, security and privacy, as well as the related visual computing areas of computer graphics, computer vision, geometric modeling, and visualization where the spatial, geospatial, and spatiotemporal data is central.
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
确定
请完成安全验证×
copy
已复制链接
快去分享给好友吧!
我知道了
右上角分享
点击右上角分享
0
联系我们:info@booksci.cn Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。 Copyright © 2023 布克学术 All rights reserved.
京ICP备2023020795号-1
ghs 京公网安备 11010802042870号
Book学术文献互助
Book学术文献互助群
群 号:481959085
Book学术官方微信