Efficiently Mining Colocation Patterns for Range Query

IF 3.5; Zone 3, Computer Science; Q2, COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE

Big Data Research; Pub Date: 2023-02-28; DOI: 10.1016/j.bdr.2023.100369

Srikanth Baride, Anuj S. Saxena, Vikram Goyal

Big Data Research, Volume 31, Article 100369. Full text: https://www.sciencedirect.com/science/article/pii/S2214579623000023

Citations: 2

Abstract


Efficiently Mining Colocation Patterns for Range Query

Colocation pattern mining finds sets of features whose instances frequently appear near one another in the same geographical space. Most existing colocation mining algorithms identify nearby objects using a single user-provided distance threshold. A suitable threshold value is data-specific, and choosing one is not easy for a user. In most real-world scenarios, it is more natural to define spatial proximity by a distance range. A range also gives the flexibility to observe how colocation patterns change with distance and so to interpret the results better. Algorithms for mining colocations with a single distance threshold cannot be applied directly to a range of distances because of the computational overhead. We identify several structural properties of colocation patterns and use them to propose an efficient single-pass colocation mining algorithm for distance range queries, namely Range-CoMine. We compare the performance of Range-CoMine with adapted versions of the well-known Join-less colocation mining approach on both real-world and synthetic data sets and show that Range-CoMine outperforms the other algorithms.
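To make the difference between a single threshold and a distance range concrete, here is a minimal Python sketch. It is not Range-CoMine, whose details the abstract does not give. It is the naive baseline of re-evaluating one size-2 pattern at every distance in a range, scored with the participation index that is commonly used as the prevalence measure in colocation mining. The function names, the (feature, x, y) point format, and the toy data are all assumptions made for illustration.

```python
from math import dist


def participation_index(points, feat_a, feat_b, d):
    """Participation index of the size-2 pattern {feat_a, feat_b} at distance d.

    points is a list of (feature, x, y) tuples (a hypothetical input format).
    The index is the smaller of the two participation ratios, i.e. the
    fraction of instances of each feature that have at least one neighbor
    of the other feature within distance d.
    """
    a_pts = [(x, y) for f, x, y in points if f == feat_a]
    b_pts = [(x, y) for f, x, y in points if f == feat_b]
    if not a_pts or not b_pts:
        return 0.0
    a_hit, b_hit = set(), set()
    # Brute-force neighbor search: every A instance against every B instance.
    for i, p in enumerate(a_pts):
        for j, q in enumerate(b_pts):
            if dist(p, q) <= d:
                a_hit.add(i)
                b_hit.add(j)
    return min(len(a_hit) / len(a_pts), len(b_hit) / len(b_pts))


def sweep(points, feat_a, feat_b, distances):
    """Naive range query: rerun the single-threshold computation per distance."""
    return {d: participation_index(points, feat_a, feat_b, d) for d in distances}


# Toy data: three instances of feature A and two of feature B on a line.
points = [
    ("A", 0.0, 0.0), ("A", 5.0, 0.0), ("A", 10.0, 0.0),
    ("B", 1.0, 0.0), ("B", 7.0, 0.0),
]
result = sweep(points, "A", "B", [1.0, 2.0, 3.0])
for d, pi in result.items():
    print(f"d={d}: participation index = {pi:.3f}")
```

On this toy data the index rises from 1/3 to 2/3 to 1.0 as the distance grows, because a larger distance can only add neighbor pairs. That change with distance is what a range query is meant to expose, and the full rescan at every distance is the kind of overhead that motivates a single-pass method.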


Source journal

Big Data Research Computer Science-Computer Science Applications

CiteScore

8.40

Self-citation rate

3.00%

Articles published

Journal description: The journal aims to promote and communicate advances in big data research by providing a fast and high quality forum for researchers, practitioners and policy makers from the many different communities working on, and with, this topic. The journal will accept papers on foundational aspects in dealing with big data, as well as papers on specific Platforms and Technologies used to deal with big data. To promote Data Science and interdisciplinary collaboration between fields, and to showcase the benefits of data driven research, papers demonstrating applications of big data in domains as diverse as Geoscience, Social Web, Finance, e-Commerce, Health Care, Environment and Climate, Physics and Astronomy, Chemistry, life sciences and drug discovery, digital libraries and scientific publications, security and government will also be considered. Occasionally the journal may publish whitepapers on policies, standards and best practices.