hex2vec: Context-Aware Embedding H3 Hexagons with OpenStreetMap Tags

Proceedings of the 4th ACM SIGSPATIAL International Workshop on AI for Geographic Knowledge Discovery. Pub Date: 2021-11-01. DOI: 10.1145/3486635.3491076

Szymon Woźniak, Piotr Szymański


Citations: 18

Abstract

Representation learning of spatial and geographic data is a rapidly developing field which allows for similarity detection between areas and high-quality inference using deep neural networks. Past approaches, however, concentrated on embedding raster imagery (maps, street or satellite photos), mobility data or road networks. In this paper we propose the first approach to learning vector representations of OpenStreetMap regions with respect to urban functions and land use in a micro-region grid. We identify a subset of OSM tags related to major characteristics of land use, building and urban region functions, and types of water, green or other natural areas. Through manual verification of tagging quality, we selected 36 cities for training region representations. Uber's H3 index was used to divide the cities into hexagons, and OSM tags were aggregated for each hexagon. We propose the hex2vec method based on the Skip-gram model with negative sampling. The resulting vector representations showcase semantic structures of the map characteristics, similar to ones found in vector-based language models. We also present insights from region similarity detection in six Polish cities and propose a region typology obtained through agglomerative clustering.
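The two steps named in the abstract, aggregating OSM tags per hexagon and training with a Skip-gram objective with negative sampling, can be illustrated with a minimal NumPy sketch. Everything beyond those two ideas is an assumption made for illustration: the abstract does not describe the encoder, so a single linear map is used here; it does not say how positive and negative hexagons are chosen, so the caller passes them in; and the hexagon IDs are placeholder strings rather than real H3 indices. The function names `aggregate_tags` and `sgns_loss` are hypothetical and are not taken from the paper.

```python
import numpy as np


def aggregate_tags(hex_ids, tags, vocabulary):
    """Count how often each tag occurs in each hexagon.

    hex_ids and tags are parallel sequences: one (hexagon, tag) pair per
    map feature. Returns a hexagon-to-row lookup and a count matrix of
    shape (number of hexagons, number of tags).
    """
    hex_index = {h: i for i, h in enumerate(sorted(set(hex_ids)))}
    tag_index = {t: j for j, t in enumerate(vocabulary)}
    counts = np.zeros((len(hex_index), len(vocabulary)))
    for h, t in zip(hex_ids, tags):
        counts[hex_index[h], tag_index[t]] += 1
    return hex_index, counts


def sgns_loss(encoder, counts, center, positive, negatives):
    """Skip-gram negative-sampling loss for one hexagon pair.

    encoder is an assumed linear map of shape (number of tags, dim).
    The loss is -log sigmoid(z_c . z_p) - sum_n log sigmoid(-z_c . z_n),
    written with logaddexp for numerical stability.
    """
    def embed(i):
        return counts[i] @ encoder

    z = embed(center)
    pos_score = z @ embed(positive)
    neg_scores = np.array([z @ embed(n) for n in negatives])
    return np.logaddexp(0.0, -pos_score) + np.logaddexp(0.0, neg_scores).sum()


# Toy usage with placeholder hexagon IDs and two tags.
hex_index, counts = aggregate_tags(
    ["a", "a", "b"], ["building", "water", "water"], ["building", "water"]
)
rng = np.random.default_rng(0)
encoder = rng.normal(scale=0.1, size=(2, 4))
loss = sgns_loss(encoder, counts, center=0, positive=1, negatives=[1])
```

In a real pipeline the positive hexagon would typically be a spatial neighbour of the center and the negatives would be sampled from elsewhere, with the encoder trained by gradient descent on this loss; those choices are assumptions here, not details confirmed by the abstract.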
