{"title":"Unifying complete and incomplete multi-view clustering through an information-theoretic generative model","authors":"Yanghang Zheng , Guoxu Zhou , Haonan Huang , Xintao Luo , Zhenhao Huang , Qibin Zhao","doi":"10.1016/j.neunet.2024.106901","DOIUrl":null,"url":null,"abstract":"<div><div>Recently, Incomplete Multi-View Clustering (IMVC) has become a rapidly growing research topic, driven by the prevalent issue of incomplete data in real-world applications. Although many approaches have been proposed to address this challenge, most methods did not provide a clear explanation of the learning process for recovery. Moreover, most of them only considered the inter-view relationships, without taking into account the relationships between samples. The influence of irrelevant information is usually ignored, which has prevented them from achieving optimal performance. To tackle the aforementioned issues, we aim at unifying compLete and incOmplete multi-view clusterinG through an Information-theoretiC generative model (LOGIC). Specifically, we have defined three principles based on information theory: comprehensiveness, consensus, and compressibility. We first explain that the essence of learning to recover missing views is to maximize the mutual information between the common representation and the data from each view. Secondly, we leverage the consensus principle to maximize the mutual information between view distributions to uncover the associations between different samples. Finally, guided by the principle of compressibility, we remove as much task-irrelevant information as possible to ensure that the common representation effectively extracts semantic information. Furthermore, it can serve as a plug-and-play missing-data recovery module for multi-view clustering models. Through extensive empirical studies, we have demonstrated the effectiveness of our approach in generating missing views. In clustering tasks, our method consistently outperforms state-of-the-art (SOTA) techniques in terms of accuracy, normalized mutual information and purity, showcasing its superiority in both recovery and clustering performance.</div></div>","PeriodicalId":49763,"journal":{"name":"Neural Networks","volume":"182 ","pages":"Article 106901"},"PeriodicalIF":6.0000,"publicationDate":"2024-11-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Neural Networks","FirstCategoryId":"94","ListUrlMain":"https://www.sciencedirect.com/science/article/pii/S089360802400830X","RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE","Score":null,"Total":0}
引用次数: 0
Abstract
Recently, Incomplete Multi-View Clustering (IMVC) has become a rapidly growing research topic, driven by the prevalent issue of incomplete data in real-world applications. Although many approaches have been proposed to address this challenge, most do not clearly explain the learning process that underlies recovery. Moreover, most consider only inter-view relationships, without taking the relationships between samples into account. The influence of task-irrelevant information is also usually ignored, which prevents these methods from achieving optimal performance. To tackle these issues, we unify compLete and incOmplete multi-view clusterinG through an Information-theoretiC generative model (LOGIC). Specifically, we define three principles based on information theory: comprehensiveness, consensus, and compressibility. We first show that the essence of learning to recover missing views is to maximize the mutual information between the common representation and the data from each view. Secondly, we leverage the consensus principle to maximize the mutual information between view distributions, uncovering associations between different samples. Finally, guided by the principle of compressibility, we remove as much task-irrelevant information as possible to ensure that the common representation effectively extracts semantic information. Furthermore, LOGIC can serve as a plug-and-play missing-data recovery module for multi-view clustering models. Through extensive empirical studies, we demonstrate the effectiveness of our approach in generating missing views. In clustering tasks, our method consistently outperforms state-of-the-art (SOTA) techniques in terms of accuracy, normalized mutual information, and purity, showcasing its superiority in both recovery and clustering performance.
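To make the three principles concrete, the sketch below shows one plausible way to assemble such an objective. It is not the authors' code: the InfoNCE surrogate for mutual-information maximization, the pairwise consensus term, the KL-based compression penalty, and all names (e.g. `logic_style_loss`, `beta`) are assumptions chosen for illustration under the abstract's description.

```python
# Illustrative sketch only, not the published LOGIC implementation.
# It combines the three principles from the abstract:
#   comprehensiveness -> maximize I(common z; each view),
#   consensus         -> maximize I(z_u; z_v) between views,
#   compressibility   -> penalize task-irrelevant bits in z.
import torch
import torch.nn.functional as F


def info_nce(anchor: torch.Tensor, positive: torch.Tensor, temperature: float = 0.1) -> torch.Tensor:
    """InfoNCE loss; minimizing it maximizes a lower bound on the mutual
    information between two batches of row-aligned representations."""
    anchor = F.normalize(anchor, dim=1)
    positive = F.normalize(positive, dim=1)
    logits = anchor @ positive.t() / temperature              # (B, B) similarities
    labels = torch.arange(anchor.size(0), device=anchor.device)
    return F.cross_entropy(logits, labels)


def logic_style_loss(z_common, view_embeddings, mu, logvar, beta: float = 1e-3):
    """Hypothetical three-term objective (names and weighting are assumptions)."""
    # Comprehensiveness: the common representation should predict every observed view.
    comp = sum(info_nce(z_common, z_v) for z_v in view_embeddings)

    # Consensus: pairwise agreement between view representations links samples across views.
    cons = sum(
        info_nce(view_embeddings[u], view_embeddings[v])
        for u in range(len(view_embeddings))
        for v in range(u + 1, len(view_embeddings))
    )

    # Compressibility: information-bottleneck-style KL(q(z|x) || N(0, I)) penalty.
    kl = -0.5 * torch.mean(1 + logvar - mu.pow(2) - logvar.exp())

    return comp + cons + beta * kl


if __name__ == "__main__":
    B, D = 32, 64
    views = [torch.randn(B, D) for _ in range(3)]             # three toy views
    z, mu, logvar = torch.randn(B, D), torch.randn(B, D), torch.randn(B, D)
    print(float(logic_style_loss(z, views, mu, logvar)))
```

In practice the relative weighting of the three terms (here a single `beta` on the compression penalty) would control the trade-off between keeping view-predictive information and discarding task-irrelevant detail; the paper itself should be consulted for the actual formulation.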
Journal Introduction:
Neural Networks is a platform that aims to foster an international community of scholars and practitioners interested in neural networks, deep learning, and other approaches to artificial intelligence and machine learning. Our journal invites submissions covering various aspects of neural networks research, from computational neuroscience and cognitive modeling to mathematical analyses and engineering applications. By providing a forum for interdisciplinary discussions between biology and technology, we aim to encourage the development of biologically-inspired artificial intelligence.