Using Bagging to improve clustering methods in the context of three-dimensional shapes

IF 1.4 | Zone 4, Computer Science | JCR Q2, Statistics & Probability
Inácio Nascimento, Raydonal Ospina, Getúlio Amorim
Journal: Advances in Data Analysis and Classification
Published: 2024-08-21
DOI: 10.1007/s11634-024-00602-9 (https://doi.org/10.1007/s11634-024-00602-9)
Citations: 0

Abstract

Cluster analysis techniques are a common approach to partitioning the objects in a dataset into distinct clusters. Clustering objects by their geometric shape is important in many fields of study. To analyze the geometric shapes of objects, researchers often employ Statistical Shape Analysis methods, which retain the geometric information that remains after the effects of scale, location, and rotation are removed. Consequently, several researchers have focused on adapting clustering algorithms for shape analysis. Recently, three-dimensional (3D) shape clustering has become crucial for analyzing, interpreting, and effectively utilizing 3D data across diverse industries, including medicine, robotics, civil engineering, and paleontology. In this study, we adapt the K-means, CLARANS, and Hill Climbing methods using an approach based on the Bagging procedure to achieve enhanced clustering accuracy. We conduct simulation experiments for both isotropic and anisotropic scenarios, considering several levels of dispersion. Furthermore, we apply the proposed approach to real datasets from the relevant literature. We evaluate the obtained clusters using cluster validation measures, specifically the Rand Index and the Fowlkes-Mallows Index. Our results demonstrate substantial improvements in clustering quality when the Bagging approach is combined with the K-means, CLARANS, and Hill Climbing methods.
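The two validation measures named in the abstract can be illustrated with a minimal sketch. The code below is not the authors' implementation; it uses the standard pair-counting definitions of the Rand Index and the Fowlkes-Mallows Index, and the function names (pair_counts, rand_index, fowlkes_mallows_index) are hypothetical names chosen for this example.

```python
from itertools import combinations
from math import sqrt


def pair_counts(labels_true, labels_pred):
    """Classify every pair of objects by whether each partition groups it together.

    Returns (tp, fp, fn, tn):
      tp: pair is together in both partitions
      fp: together in the predicted partition only
      fn: together in the reference partition only
      tn: separated in both partitions
    """
    if len(labels_true) != len(labels_pred):
        raise ValueError("label sequences must have the same length")
    tp = fp = fn = tn = 0
    for i, j in combinations(range(len(labels_true)), 2):
        same_true = labels_true[i] == labels_true[j]
        same_pred = labels_pred[i] == labels_pred[j]
        if same_true and same_pred:
            tp += 1
        elif same_pred:
            fp += 1
        elif same_true:
            fn += 1
        else:
            tn += 1
    return tp, fp, fn, tn


def rand_index(labels_true, labels_pred):
    """Fraction of object pairs on which the two partitions agree."""
    tp, fp, fn, tn = pair_counts(labels_true, labels_pred)
    total = tp + fp + fn + tn
    # With fewer than two objects there are no pairs; treat as full agreement.
    return (tp + tn) / total if total else 1.0


def fowlkes_mallows_index(labels_true, labels_pred):
    """Geometric mean of pairwise precision and pairwise recall."""
    tp, fp, fn, _ = pair_counts(labels_true, labels_pred)
    denom = sqrt((tp + fp) * (tp + fn))
    # If either partition has no co-clustered pairs, the index is taken as 0.
    return tp / denom if denom else 0.0


if __name__ == "__main__":
    reference = [0, 0, 0, 1, 1, 1]   # illustrative ground-truth groups
    clustering = [0, 0, 1, 1, 1, 1]  # illustrative clustering output
    print(rand_index(reference, clustering))
    print(fowlkes_mallows_index(reference, clustering))
```

Both indices depend only on which objects are grouped together, so relabeling the clusters does not change the score. The pairwise loop is quadratic in the number of objects, which is acceptable for a sketch but would normally be replaced by a contingency-table computation on larger datasets.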


Source journal
CiteScore: 3.40
Self-citation rate: 6.20%
Articles published: 45
Review time: >12 weeks
About the journal: The international journal Advances in Data Analysis and Classification (ADAC) is designed as a forum for high-standard publications on research and applications concerning the extraction of knowable aspects from many types of data. It publishes articles on such topics as structural, quantitative, or statistical approaches for the analysis of data; advances in classification, clustering, and pattern recognition methods; strategies for modeling complex data and mining large data sets; methods for the extraction of knowledge from data; and applications of advanced methods in specific domains of practice. Articles illustrate how new domain-specific knowledge can be made available from data by skillful use of data analysis methods. The journal also publishes survey papers that outline and illuminate the basic ideas and techniques of special approaches.