Parameter-free discrete clustering via adaptive hypergraph fusion

IF 6.8 1区 计算机科学 0 COMPUTER SCIENCE, INFORMATION SYSTEMS
Yu Zhou , Ben Yang , Xuetao Zhang , Badong Chen
{"title":"Parameter-free discrete clustering via adaptive hypergraph fusion","authors":"Yu Zhou ,&nbsp;Ben Yang ,&nbsp;Xuetao Zhang ,&nbsp;Badong Chen","doi":"10.1016/j.ins.2025.122677","DOIUrl":null,"url":null,"abstract":"<div><div>Graph-based clustering has garnered significant attention due to its outstanding performance in uncovering sample structures. However, existing graph-based methods face two major challenges: 1) In graph construction, they typically focus only on direct connections between samples or an exact high-order relationship, neglecting the impact of hidden complex relationships on clustering performance; 2) The separation of spectral analysis and category acquisition into two distinct stages often results in a loss of effectiveness. To handle these problems, we propose a parameter-free discrete clustering method, called parameter-free discrete clustering via adaptive hypergraph fusion (DCAHF). Specifically, DCAHF first produces multiple different hypergraphs, each serving as a biased approximation of the data's intrinsic manifold. These complementary approximations capture distinct local-to-global geometric patterns. Then, it introduces an adaptive fusion strategy that learns optimal weights to combine them into a single consensus hypergraph on manifold space, effectively reconstructing the real manifold structure with reduced bias and improved integrity. Finally, discrete spectral analysis is performed directly on the consensus hypergraph to generate discrete sample categories, thereby avoiding the performance loss associated with two-stage approaches. Thus, DCAHF is a high-performance, parameter-free clustering model that can flexibly adapt to various clustering tasks. Since the DCAHF model cannot be solved using gradient descent methods, we develop a coordinate descent-based optimization algorithm to efficiently solve the model. Extensive experimental results demonstrate that DCAHF significantly enhances clustering effectiveness while maintaining comparable efficiency to state-of-the-art methods.</div></div>","PeriodicalId":51063,"journal":{"name":"Information Sciences","volume":"723 ","pages":"Article 122677"},"PeriodicalIF":6.8000,"publicationDate":"2025-09-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Information Sciences","FirstCategoryId":"94","ListUrlMain":"https://www.sciencedirect.com/science/article/pii/S0020025525008102","RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"0","JCRName":"COMPUTER SCIENCE, INFORMATION SYSTEMS","Score":null,"Total":0}
引用次数: 0

Abstract

Graph-based clustering has garnered significant attention due to its outstanding performance in uncovering sample structures. However, existing graph-based methods face two major challenges: 1) In graph construction, they typically focus only on direct connections between samples or an exact high-order relationship, neglecting the impact of hidden complex relationships on clustering performance; 2) The separation of spectral analysis and category acquisition into two distinct stages often results in a loss of effectiveness. To handle these problems, we propose a parameter-free discrete clustering method, called parameter-free discrete clustering via adaptive hypergraph fusion (DCAHF). Specifically, DCAHF first produces multiple different hypergraphs, each serving as a biased approximation of the data's intrinsic manifold. These complementary approximations capture distinct local-to-global geometric patterns. Then, it introduces an adaptive fusion strategy that learns optimal weights to combine them into a single consensus hypergraph on manifold space, effectively reconstructing the real manifold structure with reduced bias and improved integrity. Finally, discrete spectral analysis is performed directly on the consensus hypergraph to generate discrete sample categories, thereby avoiding the performance loss associated with two-stage approaches. Thus, DCAHF is a high-performance, parameter-free clustering model that can flexibly adapt to various clustering tasks. Since the DCAHF model cannot be solved using gradient descent methods, we develop a coordinate descent-based optimization algorithm to efficiently solve the model. Extensive experimental results demonstrate that DCAHF significantly enhances clustering effectiveness while maintaining comparable efficiency to state-of-the-art methods.

Abstract Image

基于自适应超图融合的无参数离散聚类
基于图的聚类由于其在揭示样本结构方面的突出表现而引起了人们的广泛关注。然而,现有的基于图的聚类方法面临两个主要挑战:1)在图的构建中,它们通常只关注样本之间的直接联系或精确的高阶关系,而忽略了隐藏的复杂关系对聚类性能的影响;2)将光谱分析和类别获取分离为两个不同的阶段往往会导致有效性的损失。为了解决这些问题,我们提出了一种无参数离散聚类方法,称为自适应超图融合无参数离散聚类(DCAHF)。具体地说,DCAHF首先产生多个不同的超图,每个超图都作为数据内在流形的有偏近似。这些互补的近似捕获不同的局部到全局的几何模式。然后,引入一种自适应融合策略,学习最优权值,将它们组合成流形空间上的单个一致性超图,有效地重建真实流形结构,降低了偏差,提高了完整性。最后,直接在一致性超图上执行离散谱分析以生成离散样本类别,从而避免了与两阶段方法相关的性能损失。因此,DCAHF是一种高性能、无参数的聚类模型,可以灵活地适应各种聚类任务。由于DCAHF模型不能用梯度下降法求解,我们开发了一种基于坐标下降的优化算法来有效求解该模型。大量的实验结果表明,DCAHF显著提高了聚类效率,同时保持了与最先进方法相当的效率。
本文章由计算机程序翻译,如有差异,请以英文原文为准。
求助全文
约1分钟内获得全文 求助全文
来源期刊
Information Sciences
Information Sciences 工程技术-计算机:信息系统
CiteScore
14.00
自引率
17.30%
发文量
1322
审稿时长
10.4 months
期刊介绍: Informatics and Computer Science Intelligent Systems Applications is an esteemed international journal that focuses on publishing original and creative research findings in the field of information sciences. We also feature a limited number of timely tutorial and surveying contributions. Our journal aims to cater to a diverse audience, including researchers, developers, managers, strategic planners, graduate students, and anyone interested in staying up-to-date with cutting-edge research in information science, knowledge engineering, and intelligent systems. While readers are expected to share a common interest in information science, they come from varying backgrounds such as engineering, mathematics, statistics, physics, computer science, cell biology, molecular biology, management science, cognitive science, neurobiology, behavioral sciences, and biochemistry.
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
确定
请完成安全验证×
copy
已复制链接
快去分享给好友吧!
我知道了
右上角分享
点击右上角分享
0
联系我们:info@booksci.cn Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。 Copyright © 2023 布克学术 All rights reserved.
京ICP备2023020795号-1
ghs 京公网安备 11010802042870号
Book学术文献互助
Book学术文献互助群
群 号:604180095
Book学术官方微信