Parallel Co-location Pattern Mining Based on Neighbor-Dependency Partition and Column Calculation

Peizhong Yang, Lizhen Wang, Xiaoxuan Wang, Lihua Zhou, Hongmei Chen
DOI: 10.1145/3474717.3483984
Published in: Proceedings of the 29th International Conference on Advances in Geographic Information Systems
Publication date: 2021-11-02
URL: https://doi.org/10.1145/3474717.3483984
Citations: 3

Abstract

A co-location pattern is a subset of spatial features whose instances are frequently located together in proximate areas. Mining co-location patterns can reveal spatial dependencies in spatial datasets and has particular value in many applications. However, discovering co-location patterns from massive spatial datasets is challenging because of the high computational cost. In this paper, we present a novel parallel co-location pattern mining approach. First, the spatial neighbor relationships are divided into neighbor-dependency partitions, which allows the mining task to be performed on each partition independently and in parallel. Then, a column-based calculation approach is proposed to replace the time-consuming generation of table instances when calculating the prevalence of patterns. To further reduce the pattern search space on each partition, two pruning strategies are suggested. We implement the parallel co-location pattern mining algorithm based on neighbor-dependency partition and column calculation, named PCPM-NDPCC, via MapReduce. Extensive experiments are conducted on real and synthetic datasets to examine the performance of PCPM-NDPCC. The experimental results show that PCPM-NDPCC is significantly more efficient than the baseline algorithms and scales better for massive spatial data processing.
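
To make the notion of pattern prevalence concrete, the following minimal Python sketch computes the participation index, the prevalence measure commonly used in co-location mining, for size-2 patterns. It is an illustration only and is not the PCPM-NDPCC algorithm: the abstract does not give the details of the neighbor-dependency partitioning, the column calculation, or the pruning strategies. The function names, the sample data, and the distance threshold `d` are all assumptions introduced here.

```python
from itertools import combinations
from math import dist

# Hypothetical toy data: instance id -> (feature, (x, y)).
instances = {
    "A1": ("A", (0.0, 0.0)),
    "A2": ("A", (5.0, 5.0)),
    "B1": ("B", (0.0, 1.0)),
    "B2": ("B", (1.0, 0.0)),
    "C1": ("C", (5.0, 6.0)),
}

def neighbor_pairs(instances, d):
    """Brute-force neighbor relation: pairs of instances of different
    features whose Euclidean distance is at most d (assumed threshold)."""
    pairs = set()
    for (ia, (fa, pa)), (ib, (fb, pb)) in combinations(instances.items(), 2):
        if fa != fb and dist(pa, pb) <= d:
            pairs.add((ia, ib))
    return pairs

def participation_index(instances, pairs, f1, f2):
    """Participation index of the size-2 pattern {f1, f2}: the minimum,
    over both features, of the fraction of that feature's instances
    that appear in at least one neighboring pair of the pattern."""
    participating = {f1: set(), f2: set()}
    for a, b in pairs:
        fa, fb = instances[a][0], instances[b][0]
        if {fa, fb} == {f1, f2}:
            participating[fa].add(a)
            participating[fb].add(b)
    totals = {
        f: sum(1 for feat, _ in instances.values() if feat == f)
        for f in (f1, f2)
    }
    return min(len(participating[f]) / totals[f] for f in (f1, f2))

pairs = neighbor_pairs(instances, d=1.5)
pi_ab = participation_index(instances, pairs, "A", "B")
pi_bc = participation_index(instances, pairs, "B", "C")
```

A pattern is reported as prevalent when its participation index reaches a user-chosen minimum threshold. Note that the sketch only tracks the distinct participating instances of each feature rather than materializing every row of the table instance; for larger patterns and datasets, avoiding that materialization is the cost the abstract says the column-based calculation is designed to address.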