Authors: Honglei He
Journal: Expert Systems, vol. 42, no. 5
Published: 2025-04-01
DOI: 10.1111/exsy.70042
URL: https://onlinelibrary.wiley.com/doi/10.1111/exsy.70042
An Improved Grid Clustering Algorithm for Geographic Data Mining
Grid clustering is a classical clustering approach whose low time complexity makes it suitable for analyzing large geographic datasets. However, it is sensitive to the grid division parameter M and the density threshold R, and its clustering accuracy is poor. This article proposes HCA-BGP, a hybrid clustering algorithm that combines grid-based and partition-based clustering. The algorithm first uses grid clustering to obtain the core part of each cluster, and then uses a partition-based method to obtain the edge part of each cluster. Experiments on simulated datasets and real geographic datasets show that it outperforms the existing grid clustering algorithm as well as several other classical algorithms. In terms of clustering accuracy, compared with the classical grid clustering algorithm Clique, the F-value of the proposed algorithm improves by 20.3% on dataset S1, by 81.8% on dataset R15, and by 7.6% on average across the eight geographic datasets. In terms of sensitivity to the parameters M and R, compared with Clique, the variance of the F-value is reduced by 89.3% on dataset S1, and the variance of the ARI is reduced by 99.9% on the real geographic dataset Data8. HCA-BGP also shows significant advantages over GDB, another grid-based clustering algorithm.
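The two-stage idea in the abstract (grid clustering for the core part, a partition-style step for the edge part) can be illustrated with a short sketch. This is not the published HCA-BGP procedure, whose details the abstract does not give. It assumes that M is the number of grid cells per dimension, that R is the minimum number of points a cell needs to count as dense, that adjacent dense cells form one core, and that every remaining point is assigned to the nearest core centroid, as in a k-means assignment step. The function name `hybrid_grid_cluster` and all variable names are hypothetical.

```python
from collections import defaultdict, deque
from itertools import product
import math


def hybrid_grid_cluster(points, m, r):
    """Illustrative two-stage clustering; NOT the published HCA-BGP.

    points: list of equal-length coordinate tuples
    m: grid cells per dimension (assumed reading of the parameter M)
    r: minimum points for a dense cell (assumed reading of the threshold R)
    Returns one label per point; all -1 if no cell is dense.
    """
    dims = len(points[0])
    lows = [min(p[d] for p in points) for d in range(dims)]
    highs = [max(p[d] for p in points) for d in range(dims)]

    def cell_of(p):
        # Map a point to its grid cell index, clamping the upper boundary.
        idx = []
        for d in range(dims):
            span = (highs[d] - lows[d]) or 1.0
            idx.append(min(int((p[d] - lows[d]) / span * m), m - 1))
        return tuple(idx)

    cells = defaultdict(list)
    for i, p in enumerate(points):
        cells[cell_of(p)].append(i)
    dense = {c for c, members in cells.items() if len(members) >= r}

    # Stage 1 (grid): breadth-first search merges adjacent dense cells,
    # diagonals included, into one core per connected component.
    offsets = [o for o in product((-1, 0, 1), repeat=dims) if any(o)]
    cell_label = {}
    n_clusters = 0
    for start in sorted(dense):
        if start in cell_label:
            continue
        cell_label[start] = n_clusters
        queue = deque([start])
        while queue:
            cur = queue.popleft()
            for off in offsets:
                nb = tuple(c + o for c, o in zip(cur, off))
                if nb in dense and nb not in cell_label:
                    cell_label[nb] = n_clusters
                    queue.append(nb)
        n_clusters += 1

    labels = [-1] * len(points)
    for c, lab in cell_label.items():
        for i in cells[c]:
            labels[i] = lab
    if n_clusters == 0:
        return labels

    # Stage 2 (partition): compute each core's centroid, then assign every
    # point outside the dense cells to the nearest centroid.
    sums = [[0.0] * dims for _ in range(n_clusters)]
    counts = [0] * n_clusters
    for i, lab in enumerate(labels):
        if lab >= 0:
            counts[lab] += 1
            for d in range(dims):
                sums[lab][d] += points[i][d]
    centroids = [[s / counts[k] for s in sums[k]] for k in range(n_clusters)]
    for i, lab in enumerate(labels):
        if lab < 0:
            labels[i] = min(
                range(n_clusters),
                key=lambda k: math.dist(points[i], centroids[k]),
            )
    return labels
```

Calling `hybrid_grid_cluster(points, m=10, r=3)` on a list of 2-D tuples returns one integer label per point. In this sketch, points in sparse cells are attached to a core in the second stage instead of being discarded as noise, which is one plausible way such a step could reduce sensitivity to M and R.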
About the journal:
Expert Systems: The Journal of Knowledge Engineering publishes papers dealing with all aspects of knowledge engineering, including individual methods and techniques in knowledge acquisition and representation, and their application in the construction of systems – including expert systems – based thereon. Detailed scientific evaluation is an essential part of any paper.
As well as traditional application areas, such as Software and Requirements Engineering, Human-Computer Interaction, and Artificial Intelligence, we are aiming at the new and growing markets for these technologies, such as Business, Economy, Market Research, and Medical and Health Care. The shift towards this new focus will be marked by a series of special issues covering hot and emergent topics.