BGICR: Bootstrap-guided iterative clustering refinement for enhanced high-dimensional psychological data analysis

IF 7.6 · Zone 1: Computer Science · JCR Q1: COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE
Khoula Al. Abri, Manjit Singh Sidhu, Faridah Hani Mohamed Salleh
Journal: Knowledge-Based Systems, vol. 330, Article 114724
Published: 2025-10-20
DOI: 10.1016/j.knosys.2025.114724
URL: https://www.sciencedirect.com/science/article/pii/S0950705125017630
Citations: 0

Abstract

High-dimensional psychological data poses challenges due to noise, overlap, and projection distortion. This study presents Bootstrap-Guided Iterative Clustering Refinement (BGICR), a new framework developed to improve clustering in reduced-dimensional spaces. The proposed method uses silhouette-guided filtering and bootstrap sampling to iteratively remove ambiguous points through structural denoising, and it monitors validation scores until convergence. We used real-world psychological assessment data and applied four dimensionality reduction techniques: t-distributed stochastic neighbour embedding, uniform manifold approximation and projection, isometric mapping, and kernel principal component analysis. Results showed that BGICR consistently outperformed conventional clustering pipelines, with uniform manifold approximation and projection yielding the most distinct and well-separated clusters. Through adaptive iterations, the refinement improved the silhouette score to 0.7405, reduced the Davies–Bouldin index to 0.3914, increased the Calinski–Harabasz score to 3755.08, and achieved a Dunn index of 0.7689. Additional validation on synthetic data (Two-Moons) and biomedical datasets (LC25000 histopathological images) confirmed improved clustering quality with stable convergence and efficient runtime. Taken together, these results establish BGICR as a statistically grounded, noise-sensitive, and generalizable method for high-dimensional data analysis across psychological, synthetic, and biomedical domains.
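The abstract names the ingredients of the refinement loop (bootstrap sampling, silhouette-guided filtering, monitoring a validation score until convergence) but not its details. As a rough illustration only, and not the authors' implementation, one way such a loop could be sketched in plain Python is shown below. The k-means step, the silhouette threshold of 0.25, the number of bootstrap replicates, the convergence tolerance, the toy data, and all function names are assumptions made for this sketch.

```python
import math
import random


def assign(points, centers):
    """Label each point with the index of its nearest centre."""
    return [min(range(len(centers)), key=lambda c: math.dist(p, centers[c]))
            for p in points]


def kmeans_centers(points, k, rng, n_iter=50):
    """Plain Lloyd's k-means on coordinate tuples; returns the fitted centres."""
    distinct = sorted(set(points))  # bootstrap resamples contain duplicates
    centers = rng.sample(distinct, min(k, len(distinct)))
    for _ in range(n_iter):
        labels = assign(points, centers)
        new_centers = []
        for c in range(len(centers)):
            members = [p for p, lab in zip(points, labels) if lab == c]
            new_centers.append(
                tuple(sum(x) / len(members) for x in zip(*members))
                if members else centers[c])  # keep old centre if cluster empties
        if new_centers == centers:
            break
        centers = new_centers
    return centers


def silhouette_samples(points, labels):
    """Per-point silhouette s = (b - a) / max(a, b); 0 for singleton clusters."""
    clusters = {}
    for i, lab in enumerate(labels):
        clusters.setdefault(lab, []).append(i)
    scores = []
    for i, p in enumerate(points):
        own = clusters[labels[i]]
        others = [idx for lab, idx in clusters.items() if lab != labels[i]]
        if len(own) < 2 or not others:
            scores.append(0.0)
            continue
        a = sum(math.dist(p, points[j]) for j in own) / (len(own) - 1)
        b = min(sum(math.dist(p, points[j]) for j in idx) / len(idx)
                for idx in others)
        scores.append((b - a) / max(a, b) if max(a, b) > 0 else 0.0)
    return scores


def refine(points, k=2, n_boot=10, threshold=0.25, tol=1e-3,
           max_rounds=10, seed=0):
    """Iteratively drop points whose bootstrap-averaged silhouette is low.

    Returns the indices of retained points and the per-round mean silhouette.
    All defaults are illustrative assumptions, not values from the paper.
    """
    rng = random.Random(seed)
    kept = list(range(len(points)))
    history = []
    for _ in range(max_rounds):
        current = [points[i] for i in kept]
        totals = [0.0] * len(current)
        for _ in range(n_boot):
            # Fit centres on a bootstrap resample, then score every current point.
            resample = [rng.choice(current) for _ in current]
            labels = assign(current, kmeans_centers(resample, k, rng))
            for j, s in enumerate(silhouette_samples(current, labels)):
                totals[j] += s
        mean_s = [t / n_boot for t in totals]
        history.append(sum(mean_s) / len(mean_s))
        survivors = [i for i, s in zip(kept, mean_s) if s >= threshold]
        converged = len(history) > 1 and abs(history[-1] - history[-2]) < tol
        if converged or len(survivors) == len(kept) or len(survivors) < 2 * k:
            break
        kept = survivors
    return kept, history


# Toy 2-D "embedding": two blobs plus a few ambiguous points midway between them.
data_rng = random.Random(42)
blob_a = [(data_rng.gauss(0, 0.5), data_rng.gauss(0, 0.5)) for _ in range(40)]
blob_b = [(data_rng.gauss(6, 0.5), data_rng.gauss(0, 0.5)) for _ in range(40)]
bridge = [(3.0, data_rng.uniform(-0.5, 0.5)) for _ in range(5)]
points = blob_a + blob_b + bridge

kept, history = refine(points, k=2)
print(len(points), "points in,", len(kept), "retained")
print("mean bootstrap silhouette per round:", [round(h, 3) for h in history])
```

In this sketch the points lying midway between the two blobs receive a low averaged silhouette and are filtered out, after which the monitored score stabilises and the loop stops. In practice the input would be a reduced-dimensional embedding (for example from UMAP or t-SNE, as in the abstract), and the other indices mentioned there (Davies–Bouldin, Calinski–Harabasz, Dunn) could be tracked alongside the silhouette.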
Source journal
Knowledge-Based Systems (Engineering & Technology, Computer Science: Artificial Intelligence)
CiteScore: 14.80
Self-citation rate: 12.50%
Articles published: 1245
Review time: 7.8 months
About the journal: Knowledge-Based Systems, an international and interdisciplinary journal in artificial intelligence, publishes original, innovative, and creative research results in the field. It focuses on knowledge-based and other artificial intelligence techniques-based systems. The journal aims to support human prediction and decision-making through data science and computation techniques, provide a balanced coverage of theory and practical study, and encourage the development and implementation of knowledge-based intelligence models, methods, systems, and software tools. Applications in business, government, education, engineering, and healthcare are emphasized.