Parallel Co-location Pattern Mining based on Neighbor-Dependency Partition and Column Calculation

Proceedings of the 29th International Conference on Advances in Geographic Information Systems. Pub Date: 2021-11-02. DOI: 10.1145/3474717.3483984

Peizhong Yang, Lizhen Wang, Xiaoxuan Wang, Lihua Zhou, Hongmei Chen


Citations: 3

Abstract



A co-location pattern is a subset of spatial features whose instances are frequently located close to one another. Mining co-location patterns reveals spatial dependencies in spatial datasets and is valuable in many applications. However, discovering co-location patterns in massive spatial datasets is challenging because of the high computational cost. In this paper, we present a novel parallel co-location pattern mining approach. First, the spatial neighbor relationships are divided into neighbor-dependency partitions, so that the mining task can be performed on each partition independently and in parallel. Then, a column-based calculation approach is proposed to replace the time-consuming generation of table instances when calculating the prevalence of patterns. To further reduce the pattern search space on each partition, two pruning strategies are proposed. We implement the resulting algorithm, parallel co-location pattern mining based on neighbor-dependency partition and column calculation (PCPM-NDPCC), on MapReduce. Extensive experiments on real and synthetic datasets examine the performance of PCPM-NDPCC. The results show that PCPM-NDPCC is significantly more efficient than the baseline algorithms and scales better to massive spatial data.
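To make the terms "table instance" and "prevalence" concrete, the following is a minimal sketch of the conventional, brute-force way to compute them, that is, the costly step the abstract says the column-based calculation is meant to replace. It is not PCPM-NDPCC. The abstract does not name a prevalence measure, so the sketch assumes the participation index that is commonly used in co-location mining; the toy coordinates, the distance threshold `d`, and all function names are hypothetical.

```python
from itertools import combinations, product
from math import dist

# Hypothetical toy dataset: (feature, instance id) -> (x, y) coordinates.
instances = {
    ("A", 1): (0.0, 0.0),
    ("A", 2): (5.0, 5.0),
    ("B", 1): (0.5, 0.0),
    ("B", 2): (5.0, 5.5),
    ("B", 3): (9.0, 9.0),
    ("C", 1): (0.0, 0.5),
}


def neighbors(p, q, d):
    """Assumed neighbor relation: Euclidean distance of at most d."""
    return dist(instances[p], instances[q]) <= d


def table_instance(pattern, d):
    """Enumerate every row instance of a pattern by brute force.

    A row instance picks one instance per feature such that all picked
    instances are pairwise neighbors. The cost grows with the product of
    the per-feature instance counts, which is why this step is expensive.
    """
    by_feature = [[k for k in instances if k[0] == f] for f in pattern]
    return [
        row
        for row in product(*by_feature)
        if all(neighbors(p, q, d) for p, q in combinations(row, 2))
    ]


def participation_index(pattern, d):
    """Assumed prevalence measure: the minimum participation ratio.

    For each feature, the participation ratio is the fraction of its
    instances that appear in at least one row instance of the pattern.
    """
    rows = table_instance(pattern, d)
    ratios = []
    for i, feature in enumerate(pattern):
        total = sum(1 for k in instances if k[0] == feature)
        participating = len({row[i] for row in rows})
        ratios.append(participating / total)
    return min(ratios)


if __name__ == "__main__":
    print(table_instance(("A", "B"), d=1.0))
    print(participation_index(("A", "B"), d=1.0))
```

With this toy data and `d = 1.0`, the pattern {A, B} has two row instances, and only two of the three B instances take part, so its participation index is 2/3. A pattern is then reported as prevalent when this value reaches a user-chosen threshold.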
