BGICR: Bootstrap-guided iterative clustering refinement for enhanced high-dimensional psychological data analysis
Authors: Khoula Al. Abri, Manjit Singh Sidhu, Faridah Hani Mohamed Salleh
Journal: Knowledge-Based Systems, volume 330, Article 114724 (impact factor 7.6; JCR Q1, Computer Science, Artificial Intelligence)
Published: 2025-10-20
DOI: 10.1016/j.knosys.2025.114724
URL: https://www.sciencedirect.com/science/article/pii/S0950705125017630
Citation count: 0
Abstract
High-dimensional psychological data poses challenges due to noise, overlap, and projection distortion. This study presents Bootstrap-Guided Iterative Clustering Refinement (BGICR), a new framework developed to improve clustering in reduced-dimensional spaces. The proposed method uses silhouette-guided filtering and bootstrap sampling to iteratively remove ambiguous points through structural denoising, and it monitors validation scores until convergence. We used real-world psychological assessment data and applied four dimensionality reduction techniques: t-distributed stochastic neighbour embedding, uniform manifold approximation and projection, isometric mapping, and kernel principal component analysis. Results showed that BGICR consistently outperformed conventional clustering pipelines, with uniform manifold approximation and projection yielding the most distinct and well-separated clusters. Through adaptive iterations, the refinement improved the silhouette score to 0.7405, reduced the Davies–Bouldin index to 0.3914, increased the Calinski–Harabasz score to 3755.08, and achieved a Dunn index of 0.7689. Additional validation on synthetic data (Two-Moons) and biomedical datasets (LC25000 histopathological images) confirmed improved clustering quality with stable convergence and efficient runtime. Taken together, these results establish BGICR as a statistically grounded, noise-sensitive, and generalizable method for high-dimensional data analysis across psychological, synthetic, and biomedical domains.
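The abstract describes the refinement loop only at a high level, so the following is a minimal sketch of one way such a loop could look, not the authors' implementation. Everything specific in it is an assumption made for illustration: plain k-means as the clustering step, a bootstrap-averaged silhouette as the ambiguity measure, the threshold of 0.2, the tolerance, the number of resamples, and the helper names `kmeans`, `silhouette_values`, `mean_score`, and `refine`. It uses only the Python standard library and a small synthetic 2-D dataset standing in for an already reduced embedding.

```python
import math
import random


def kmeans(points, k, rng, n_iter=20):
    """Plain Lloyd's k-means; returns one cluster label per point."""
    centers = rng.sample(points, k)
    labels = [0] * len(points)
    for _ in range(n_iter):
        labels = [min(range(k), key=lambda c: math.dist(p, centers[c])) for p in points]
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # keep the old center if a cluster went empty
                centers[c] = tuple(sum(x) / len(members) for x in zip(*members))
    return labels


def silhouette_values(points, labels, k, ref):
    """Silhouette of every point, measured against the reference indices `ref`.
    A point gets None when some cluster has no reference point to compare with."""
    values = []
    for i, p in enumerate(points):
        mean_dist = []
        for c in range(k):
            d = [math.dist(p, points[j]) for j in ref if labels[j] == c and j != i]
            mean_dist.append(sum(d) / len(d) if d else None)
        if None in mean_dist:
            values.append(None)
            continue
        a = mean_dist[labels[i]]  # cohesion: mean distance to own cluster
        b = min(m for c, m in enumerate(mean_dist) if c != labels[i])  # separation
        values.append((b - a) / max(a, b) if max(a, b) > 0 else 0.0)
    return values


def mean_score(values):
    """Mean silhouette over scored points; -1.0 (worst case) if none were scored."""
    valid = [v for v in values if v is not None]
    return sum(valid) / len(valid) if valid else -1.0


def refine(points, k=2, n_boot=20, threshold=0.2, tol=1e-3, max_iter=10, seed=0):
    """Iteratively drop points whose bootstrap-averaged silhouette is below
    `threshold`, accepting a step only if the mean silhouette improves by `tol`."""
    rng = random.Random(seed)
    kept = list(points)
    labels = kmeans(kept, k, rng)
    score = mean_score(silhouette_values(kept, labels, k, range(len(kept))))
    history = [score]
    for _ in range(max_iter):
        n = len(kept)
        totals, counts = [0.0] * n, [0] * n
        for _ in range(n_boot):
            ref = [rng.randrange(n) for _ in range(n)]  # resample with replacement
            for i, v in enumerate(silhouette_values(kept, labels, k, ref)):
                if v is not None:
                    totals[i] += v
                    counts[i] += 1
        # Keep points with a high enough average; never-scored points are dropped too.
        candidate = [p for p, t, c in zip(kept, totals, counts) if c and t / c >= threshold]
        if len(candidate) == n or len(candidate) < 2 * k:
            break  # nothing ambiguous left, or too few points to continue
        cand_labels = kmeans(candidate, k, rng)
        cand_score = mean_score(
            silhouette_values(candidate, cand_labels, k, range(len(candidate))))
        if cand_score - score < tol:
            break  # validation score stopped improving: treat as converged
        kept, labels, score = candidate, cand_labels, cand_score
        history.append(score)
    return kept, labels, history


# Synthetic stand-in for a 2-D embedding: two blobs plus a few points between them.
data_rng = random.Random(42)
blob_a = [(data_rng.gauss(0, 0.5), data_rng.gauss(0, 0.5)) for _ in range(20)]
blob_b = [(data_rng.gauss(4, 0.5), data_rng.gauss(0, 0.5)) for _ in range(20)]
bridge = [(data_rng.uniform(1.5, 2.5), data_rng.gauss(0, 0.5)) for _ in range(6)]
points = blob_a + blob_b + bridge

kept, labels, history = refine(points)
print(len(points), "->", len(kept), "points; silhouette history:",
      [round(s, 3) for s in history])
```

Because a removal step is accepted only when the mean silhouette rises by at least `tol`, the recorded history is non-decreasing by construction. In this sketch that acceptance rule plays the role of the convergence monitoring mentioned in the abstract. The other indices the abstract reports (Davies–Bouldin, Calinski–Harabasz, Dunn) could be tracked in the same loop but are left out here for brevity.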
About the journal:
Knowledge-Based Systems is an international, interdisciplinary journal in artificial intelligence that publishes original, innovative, and creative research results in the field. It focuses on systems built on knowledge-based and other artificial intelligence techniques. The journal aims to support human prediction and decision-making through data science and computation techniques, to provide balanced coverage of theory and practical study, and to encourage the development and implementation of knowledge-based intelligence models, methods, systems, and software tools. It emphasizes applications in business, government, education, engineering, and healthcare.