Debiasing weighted multi-view k-means clustering based on causal regularization

Impact factor 7.5 · Zone 1 (Computer Science) · Q1 (Computer Science, Artificial Intelligence)
Xiuqi Huang, Hong Tao, Haotian Ni, Chenping Hou
Pattern Recognition, Volume 160, Article 111195. Published 2024-11-16. DOI: 10.1016/j.patcog.2024.111195
https://www.sciencedirect.com/science/article/pii/S0031320324009464
Citations: 0

Abstract

In the field of unsupervised learning, many methods such as clustering rely on exploring the correlations among features. However, considering these correlations is not always advantageous for learning models. The biased selection of data may lead to redundant and unstable correlations among features, adversely affecting the performance of learning models. Multi-view data presents more complex feature correlations with potential redundancy and varying distributions across views, necessitating detailed analysis. This paper proposes a causal regularized debiased multi-view k-means clustering (DMKC) method to counteract redundant feature correlations stemming from sample selection bias. This method introduces a covariate weighted balance method from causal inference to mitigate redundant bias in multi-view clustering by adjusting sample weights. The approach combines sample and view weights within a k-means loss framework, effectively eliminating feature redundancy and enhancing clustering performance amidst sample selection bias. The optimization process of the relevant parameters is detailed in this paper, and comprehensive experiments demonstrate the effectiveness of the method.
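
The abstract describes combining sample weights and view weights inside a k-means loss but gives neither the objective nor the update rules. The following is therefore only a generic sketch of a sample- and view-weighted multi-view k-means, not the authors' DMKC. It assumes the sample weights are fixed and supplied by the caller (DMKC learns them through a causal regularizer, which is not reproduced here), and it uses a standard closed-form view-weight update under the constraint that the view weights sum to 1. The function name `weighted_multiview_kmeans` and the parameters `sample_w` and `gamma` are hypothetical.

```python
import numpy as np

def weighted_multiview_kmeans(views, k, sample_w=None, gamma=2.0, n_iter=50, seed=0):
    """Minimise sum_v alpha_v**gamma * sum_i w_i * ||x_i^v - c_{z_i}^v||^2.

    views    : list of arrays, each of shape (n_samples, d_v)
    sample_w : fixed non-negative sample weights (assumption: given, not learned)
    gamma    : exponent > 1 controlling how sharply view weights concentrate
    """
    rng = np.random.default_rng(seed)
    n = views[0].shape[0]
    w = np.ones(n) if sample_w is None else np.asarray(sample_w, dtype=float)
    alpha = np.full(len(views), 1.0 / len(views))

    # Initialise the centroids of every view from the same k random samples.
    idx = rng.choice(n, size=k, replace=False)
    centers = [X[idx].astype(float) for X in views]
    labels = np.full(n, -1)

    for _ in range(n_iter):
        # Assignment step: view-weighted squared distance to each centroid.
        total = np.zeros((n, k))
        for a, X, C in zip(alpha, views, centers):
            d = ((X[:, None, :] - C[None, :, :]) ** 2).sum(axis=2)
            total += a ** gamma * d
        new_labels = total.argmin(axis=1)

        # Centroid step: sample-weighted mean of each cluster, per view.
        for j in range(k):
            mask = new_labels == j
            if w[mask].sum() > 0:  # skip empty or zero-weight clusters
                for v, X in enumerate(views):
                    centers[v][j] = np.average(X[mask], axis=0, weights=w[mask])

        # View-weight step: alpha_v proportional to cost_v**(-1 / (gamma - 1)).
        cost = np.array([
            (w * ((X - C[new_labels]) ** 2).sum(axis=1)).sum()
            for X, C in zip(views, centers)
        ]) + 1e-12
        alpha = cost ** (-1.0 / (gamma - 1.0))
        alpha /= alpha.sum()

        if np.array_equal(new_labels, labels):
            break
        labels = new_labels

    return labels, alpha
```

In this sketch, a view whose weighted within-cluster cost is lower receives a larger weight, and samples with larger weights pull the centroids more strongly. A debiasing scheme of the kind the abstract describes would replace the fixed `sample_w` with weights learned jointly with the clustering.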
Source journal: Pattern Recognition
Category: Engineering and Technology, Engineering: Electrical & Electronic
CiteScore: 14.40
Self-citation rate: 16.20%
Articles published: 683
Review time: 5.6 months
About the journal: The field of Pattern Recognition is both mature and rapidly evolving, playing a crucial role in various related fields such as computer vision, image processing, text analysis, and neural networks. It closely intersects with machine learning and is being applied in emerging areas like biometrics, bioinformatics, multimedia data analysis, and data science. The journal Pattern Recognition, established half a century ago during the early days of computer science, has since grown significantly in scope and influence.