Chaoyang Zhang , Hang Xue , Kai Nie , Xihui Wu , Zhengzheng Lou , Shouyi Yang , Qinglei Zhou , Shizhe Hu
{"title":"很高兴见到具有大聚类和特征的图像:集群加权多模态共聚方法","authors":"Chaoyang Zhang , Hang Xue , Kai Nie , Xihui Wu , Zhengzheng Lou , Shouyi Yang , Qinglei Zhou , Shizhe Hu","doi":"10.1016/j.ipm.2024.103735","DOIUrl":null,"url":null,"abstract":"<div><p>Multi-modal image clustering focuses on exploring and exploiting related information across various modals of input images to obtain clear image cluster patterns. Recent multi-modal/view clustering methods have shown promising performance in solving the image clustering problem. However, most existing methods fail to properly deal with the multi-modal image data with massive number of clusters and high dimensionality of features in real-world applications, such as image retrieval, multi-modal autonomous driving perception and industrial automation. We call this problem “Big Clusters and Features” for short just as “Big Data” for large number of samples. To fix this challenging problem, we in this article design a general multi-modal image clustering framework which integrates cluster-wise weight learning, feature learning, and clustering structure learning. Under this framework, we further propose a new Cluster-weighted Multi-modal Information Bottleneck Co-clustering (CMIBC) method, and it could effectively measure image cluster importance information and discriminative features of each modal for obtaining satisfactory image clustering performance. Unlike existing cluster weight learning methods only considering intra-cluster similarity or cross-cluster dissimilarity, we design a novel cluster-wise weight learning strategy by jointly considering and enjoying the best of both worlds. Many carefully designed experiments on various multi-modal image datasets with big clusters and features reveal the competitive advantages of the CMIBC algorithm over lots of compared single/multi-modal clustering methods, particularly the notable improvement of 3.12% and 5.28% on the Leaves Plant Species dataset in terms of accuracy and normalized mutual information. Owing to the promising performance, the proposed CMIBC may be extended to many other practical applications, e.g., multi-modal medical analysis and video recognition.</p></div>","PeriodicalId":50365,"journal":{"name":"Information Processing & Management","volume":null,"pages":null},"PeriodicalIF":7.4000,"publicationDate":"2024-05-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"Nice to meet images with Big Clusters and Features: A cluster-weighted multi-modal co-clustering method\",\"authors\":\"Chaoyang Zhang , Hang Xue , Kai Nie , Xihui Wu , Zhengzheng Lou , Shouyi Yang , Qinglei Zhou , Shizhe Hu\",\"doi\":\"10.1016/j.ipm.2024.103735\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"<div><p>Multi-modal image clustering focuses on exploring and exploiting related information across various modals of input images to obtain clear image cluster patterns. Recent multi-modal/view clustering methods have shown promising performance in solving the image clustering problem. However, most existing methods fail to properly deal with the multi-modal image data with massive number of clusters and high dimensionality of features in real-world applications, such as image retrieval, multi-modal autonomous driving perception and industrial automation. We call this problem “Big Clusters and Features” for short just as “Big Data” for large number of samples. To fix this challenging problem, we in this article design a general multi-modal image clustering framework which integrates cluster-wise weight learning, feature learning, and clustering structure learning. Under this framework, we further propose a new Cluster-weighted Multi-modal Information Bottleneck Co-clustering (CMIBC) method, and it could effectively measure image cluster importance information and discriminative features of each modal for obtaining satisfactory image clustering performance. Unlike existing cluster weight learning methods only considering intra-cluster similarity or cross-cluster dissimilarity, we design a novel cluster-wise weight learning strategy by jointly considering and enjoying the best of both worlds. Many carefully designed experiments on various multi-modal image datasets with big clusters and features reveal the competitive advantages of the CMIBC algorithm over lots of compared single/multi-modal clustering methods, particularly the notable improvement of 3.12% and 5.28% on the Leaves Plant Species dataset in terms of accuracy and normalized mutual information. Owing to the promising performance, the proposed CMIBC may be extended to many other practical applications, e.g., multi-modal medical analysis and video recognition.</p></div>\",\"PeriodicalId\":50365,\"journal\":{\"name\":\"Information Processing & Management\",\"volume\":null,\"pages\":null},\"PeriodicalIF\":7.4000,\"publicationDate\":\"2024-05-19\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Information Processing & Management\",\"FirstCategoryId\":\"94\",\"ListUrlMain\":\"https://www.sciencedirect.com/science/article/pii/S0306457324000955\",\"RegionNum\":1,\"RegionCategory\":\"管理学\",\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"Q1\",\"JCRName\":\"COMPUTER SCIENCE, INFORMATION SYSTEMS\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Information Processing & Management","FirstCategoryId":"94","ListUrlMain":"https://www.sciencedirect.com/science/article/pii/S0306457324000955","RegionNum":1,"RegionCategory":"管理学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"COMPUTER SCIENCE, INFORMATION SYSTEMS","Score":null,"Total":0}
Nice to meet images with Big Clusters and Features: A cluster-weighted multi-modal co-clustering method
Multi-modal image clustering focuses on exploring and exploiting related information across various modals of input images to obtain clear image cluster patterns. Recent multi-modal/view clustering methods have shown promising performance in solving the image clustering problem. However, most existing methods fail to properly deal with the multi-modal image data with massive number of clusters and high dimensionality of features in real-world applications, such as image retrieval, multi-modal autonomous driving perception and industrial automation. We call this problem “Big Clusters and Features” for short just as “Big Data” for large number of samples. To fix this challenging problem, we in this article design a general multi-modal image clustering framework which integrates cluster-wise weight learning, feature learning, and clustering structure learning. Under this framework, we further propose a new Cluster-weighted Multi-modal Information Bottleneck Co-clustering (CMIBC) method, and it could effectively measure image cluster importance information and discriminative features of each modal for obtaining satisfactory image clustering performance. Unlike existing cluster weight learning methods only considering intra-cluster similarity or cross-cluster dissimilarity, we design a novel cluster-wise weight learning strategy by jointly considering and enjoying the best of both worlds. Many carefully designed experiments on various multi-modal image datasets with big clusters and features reveal the competitive advantages of the CMIBC algorithm over lots of compared single/multi-modal clustering methods, particularly the notable improvement of 3.12% and 5.28% on the Leaves Plant Species dataset in terms of accuracy and normalized mutual information. Owing to the promising performance, the proposed CMIBC may be extended to many other practical applications, e.g., multi-modal medical analysis and video recognition.
期刊介绍:
Information Processing and Management is dedicated to publishing cutting-edge original research at the convergence of computing and information science. Our scope encompasses theory, methods, and applications across various domains, including advertising, business, health, information science, information technology marketing, and social computing.
We aim to cater to the interests of both primary researchers and practitioners by offering an effective platform for the timely dissemination of advanced and topical issues in this interdisciplinary field. The journal places particular emphasis on original research articles, research survey articles, research method articles, and articles addressing critical applications of research. Join us in advancing knowledge and innovation at the intersection of computing and information science.