Parameter-free discrete clustering via adaptive hypergraph fusion
Authors: Yu Zhou, Ben Yang, Xuetao Zhang, Badong Chen
Journal: Information Sciences, volume 723, Article 122677 (Journal Article; impact factor 6.8; JCR category: Computer Science, Information Systems)
Publication date: 2025-09-12
DOI: 10.1016/j.ins.2025.122677
URL: https://www.sciencedirect.com/science/article/pii/S0020025525008102
Citation count: 0
Abstract
Graph-based clustering has garnered significant attention due to its outstanding performance in uncovering sample structures. However, existing graph-based methods face two major challenges: 1) in graph construction, they typically focus only on direct connections between samples or on an exact high-order relationship, neglecting the impact of hidden complex relationships on clustering performance; 2) the separation of spectral analysis and category acquisition into two distinct stages often results in a loss of effectiveness. To address these problems, we propose parameter-free discrete clustering via adaptive hypergraph fusion (DCAHF). Specifically, DCAHF first produces multiple different hypergraphs, each serving as a biased approximation of the data's intrinsic manifold. These complementary approximations capture distinct local-to-global geometric patterns. Then, it introduces an adaptive fusion strategy that learns optimal weights to combine them into a single consensus hypergraph on manifold space, effectively reconstructing the real manifold structure with reduced bias and improved integrity. Finally, discrete spectral analysis is performed directly on the consensus hypergraph to generate discrete sample categories, thereby avoiding the performance loss associated with two-stage approaches. Thus, DCAHF is a high-performance, parameter-free clustering model that can flexibly adapt to various clustering tasks. Since the DCAHF model cannot be solved using gradient descent methods, we develop a coordinate descent-based optimization algorithm to solve it efficiently. Extensive experimental results demonstrate that DCAHF significantly enhances clustering effectiveness while maintaining efficiency comparable to state-of-the-art methods.
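The three stages named in the abstract (several hypergraphs, weighted fusion into a consensus, discrete labels found by coordinate descent) can be illustrated with a small NumPy sketch. The abstract does not give the model's formulas, so everything specific here is an assumption made for illustration: the k-nearest-neighbour hyperedges, the inverse-distance reweighting rule, the ratio-cut-style objective, and all function names are hypothetical stand-ins, not the DCAHF formulation. The neighbourhood sizes in the usage lines are also just demo values.

```python
import numpy as np


def knn_hypergraph_similarity(X, k):
    """Normalised similarity of a k-NN hypergraph (one hyperedge per sample)."""
    n = X.shape[0]
    d2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)    # squared distances
    nbrs = np.argsort(d2, axis=1)[:, : k + 1]               # k+1 closest points
    H = np.zeros((n, n))                                    # vertices x hyperedges
    H[nbrs.ravel(), np.repeat(np.arange(n), k + 1)] = 1.0
    H[np.arange(n), np.arange(n)] = 1.0                     # each sample joins its own edge
    de = H.sum(axis=0)                                      # hyperedge degrees
    dv = H.sum(axis=1)                                      # vertex degrees (>= 1)
    A = (H / de) @ H.T                                      # H De^-1 H^T, unit edge weights
    return A / np.sqrt(np.outer(dv, dv))                    # Dv^-1/2 (.) Dv^-1/2


def fuse_hypergraphs(As, iters=20):
    """Assumed fusion rule: weight each graph by its inverse distance to the consensus."""
    w = np.full(len(As), 1.0 / len(As))
    for _ in range(iters):
        S = sum(wi * A for wi, A in zip(w, As))
        dist = np.array([np.linalg.norm(S - A) for A in As])
        w = 1.0 / (2.0 * dist + 1e-12)
        w /= w.sum()
    return sum(wi * A for wi, A in zip(w, As)), w


def cluster_objective(S, y, c):
    """Sum over clusters of (within-cluster similarity) / (cluster size)."""
    Y = np.eye(c)[y]
    return float((((S @ Y) * Y).sum(axis=0) / Y.sum(axis=0)).sum())


def discrete_labels(S, c, sweeps=50, seed=0):
    """Coordinate descent on labels: move one sample at a time if the objective rises."""
    n = S.shape[0]
    rng = np.random.default_rng(seed)
    y = rng.permutation(np.arange(n) % c)                   # balanced random start
    Y = np.eye(c)[y]                                        # one-hot indicator matrix
    s = ((S @ Y) * Y).sum(axis=0)                           # within-cluster similarity sums
    cnt = Y.sum(axis=0)                                     # cluster sizes
    for _ in range(sweeps):
        moved = False
        for i in range(n):
            a = y[i]
            if cnt[a] <= 1:                                 # never empty a cluster
                continue
            g = S[i] @ Y                                    # similarity of i to each cluster
            s_add = s + 2.0 * g + S[i, i]                   # sums if i joined each cluster
            s_rem = s[a] - 2.0 * g[a] + S[i, i]             # sum of cluster a without i
            gain = s_add / (cnt + 1) - s / cnt
            gain += s_rem / (cnt[a] - 1) - s[a] / cnt[a]
            gain[a] = 0.0                                   # staying put changes nothing
            b = int(np.argmax(gain))
            if gain[b] > 1e-12:
                Y[i, a], Y[i, b], y[i] = 0.0, 1.0, b
                s[a], s[b] = s_rem, s_add[b]
                cnt[a] -= 1
                cnt[b] += 1
                moved = True
        if not moved:                                       # local optimum reached
            break
    return y


if __name__ == "__main__":
    rng = np.random.default_rng(1)
    X = np.vstack([rng.normal(0, 0.3, (20, 2)), rng.normal(5, 0.3, (20, 2))])
    S, w = fuse_hypergraphs([knn_hypergraph_similarity(X, k) for k in (3, 5, 7)])
    print("fusion weights:", w)
    print("labels:", discrete_labels(S, c=2))
```

Because each accepted move strictly increases the objective, the label search stops at a local optimum; it can depend on the random start, so results from a single run should be read with that in mind.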
About the journal:
Information Sciences: Informatics and Computer Science, Intelligent Systems, Applications is an esteemed international journal that focuses on publishing original and creative research findings in the field of information sciences. We also feature a limited number of timely tutorial and survey contributions.
Our journal aims to cater to a diverse audience, including researchers, developers, managers, strategic planners, graduate students, and anyone interested in staying up-to-date with cutting-edge research in information science, knowledge engineering, and intelligent systems. While readers are expected to share a common interest in information science, they come from varying backgrounds such as engineering, mathematics, statistics, physics, computer science, cell biology, molecular biology, management science, cognitive science, neurobiology, behavioral sciences, and biochemistry.