Authors: Inácio Nascimento, Raydonal Ospina, Getúlio Amorim
Journal: Advances in Data Analysis and Classification
Publication date: 2024-08-21
Publication type: Journal Article
DOI: https://doi.org/10.1007/s11634-024-00602-9
Using Bagging to improve clustering methods in the context of three-dimensional shapes
Abstract
Cluster analysis is a common approach to partitioning the objects in a dataset into distinct groups. Clustering objects by their geometric shape is important in many fields of study. To analyze shape, researchers often employ Statistical Shape Analysis methods, which retain the information that remains once differences in scale, location, and rotation have been removed. Consequently, several researchers have focused on adapting clustering algorithms for shape analysis. Recently, three-dimensional (3D) shape clustering has become crucial for analyzing, interpreting, and effectively using 3D data across diverse fields, including medicine, robotics, civil engineering, and paleontology. In this study, we adapt the K-means, CLARANS, and Hill Climbing methods using an approach based on the Bagging procedure to improve clustering accuracy. We conduct simulation experiments for both isotropic and anisotropic scenarios, considering several levels of dispersion. Furthermore, we apply the proposed approach to real datasets from the literature. We evaluate the resulting clusters using two cluster validation measures, the Rand Index and the Fowlkes-Mallows Index. Our results show substantial improvements in clustering quality when the Bagging approach is combined with the K-means, CLARANS, and Hill Climbing methods.
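As an illustration of the two validation measures named in the abstract, the following minimal Python sketch computes the Rand Index and the Fowlkes-Mallows Index from pair counts. The function names and the toy labels are illustrative assumptions and are not taken from the paper, whose own implementation is not described here.

```python
from itertools import combinations
from math import sqrt


def pair_counts(true_labels, pred_labels):
    """Classify every pair of objects by how the two partitions treat it.

    Returns (tp, fp, fn, tn):
      tp: pair grouped together in both partitions
      fp: together in the predicted partition only
      fn: together in the reference partition only
      tn: separated in both partitions
    """
    if len(true_labels) != len(pred_labels):
        raise ValueError("label sequences must have the same length")
    tp = fp = fn = tn = 0
    for i, j in combinations(range(len(true_labels)), 2):
        same_true = true_labels[i] == true_labels[j]
        same_pred = pred_labels[i] == pred_labels[j]
        if same_true and same_pred:
            tp += 1
        elif same_pred:
            fp += 1
        elif same_true:
            fn += 1
        else:
            tn += 1
    return tp, fp, fn, tn


def rand_index(true_labels, pred_labels):
    """Fraction of pairs on which the two partitions agree."""
    tp, fp, fn, tn = pair_counts(true_labels, pred_labels)
    total = tp + fp + fn + tn
    return (tp + tn) / total if total else 1.0


def fowlkes_mallows_index(true_labels, pred_labels):
    """Geometric mean of pairwise precision and pairwise recall."""
    tp, fp, fn, _ = pair_counts(true_labels, pred_labels)
    if tp == 0:
        return 0.0  # convention used here when no pair is shared
    return tp / sqrt((tp + fp) * (tp + fn))


# Toy example (hypothetical labels): one object placed in the wrong cluster.
reference = [0, 0, 0, 1, 1, 1]
predicted = [0, 0, 1, 1, 1, 1]
print(rand_index(reference, predicted))
print(fowlkes_mallows_index(reference, predicted))
```

Both measures depend only on which objects are grouped together, so relabeling the clusters (for example, swapping labels 0 and 1) leaves the scores unchanged.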
About the journal:
The international journal Advances in Data Analysis and Classification (ADAC) is designed as a forum for high-standard publications on research and applications concerning the extraction of knowable aspects from many types of data. It publishes articles on such topics as structural, quantitative, or statistical approaches for the analysis of data; advances in classification, clustering, and pattern recognition methods; strategies for modeling complex data and mining large data sets; methods for the extraction of knowledge from data; and applications of advanced methods in specific domains of practice. Articles illustrate how new domain-specific knowledge can be made available from data by skillful use of data analysis methods. The journal also publishes survey papers that outline and illuminate the basic ideas and techniques of special approaches.