Le Li, Jianjun Yang, Yang Xu, Zhen Qin, Honggang Zhang
2014 International Conference on Machine Learning and Cybernetics, published 2014-07-13. DOI: 10.1109/ICMLC.2014.7009720 (https://doi.org/10.1109/ICMLC.2014.7009720)
Documents clustering based on max-correntropy nonnegative matrix factorization
Nonnegative matrix factorization (NMF) has been successfully applied to many problems in both classification and clustering. Commonly used NMF algorithms mainly aim to minimize the l2 distance or the Kullback-Leibler (KL) divergence between the data matrix and its approximation, which may not be suitable for nonlinear cases. In this paper, we propose a new decomposition method for document clustering that maximizes the correntropy between the original matrix and the product of two low-rank matrices. This method also allows us to learn new basis vectors of the semantic feature space from the data. To our knowledge, no existing work clusters high-dimensional document data by maximizing the correntropy in NMF. Our experimental results show that the proposed method outperforms other variants of NMF algorithms on the Reuters21578 and TDT2 datasets.
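To make the idea of "maximizing the correntropy between the original matrix and the product of two low-rank matrices" concrete, here is a minimal sketch in Python. It is not the authors' algorithm: it assumes an entry-wise Gaussian-kernel correntropy with a fixed bandwidth `sigma`, and it uses a standard half-quadratic reweighting combined with weighted multiplicative NMF updates as a stand-in optimizer. The function names `correntropy` and `max_correntropy_nmf`, the terms-by-documents layout of `X`, and the argmax cluster assignment are all assumptions made for illustration.

```python
import numpy as np


def correntropy(X, Y, sigma):
    """Entry-wise Gaussian-kernel correntropy of two same-shape matrices.

    Assumed form: mean over entries of exp(-(x - y)^2 / (2 * sigma^2)).
    The value is 1.0 when X == Y and approaches 0 for large errors.
    """
    return float(np.mean(np.exp(-((X - Y) ** 2) / (2.0 * sigma ** 2))))


def max_correntropy_nmf(X, rank, sigma=1.0, n_iter=100, seed=0, eps=1e-12):
    """Hypothetical sketch: factor X (terms x documents) as W @ H, W, H >= 0.

    Each iteration (half-quadratic scheme, an assumption of this sketch):
      1. compute per-entry weights P from the current residual;
      2. apply weighted multiplicative updates to W and H, which reduce
         the P-weighted squared error and keep both factors nonnegative.
    """
    rng = np.random.default_rng(seed)
    n_terms, n_docs = X.shape
    W = rng.random((n_terms, rank)) + eps
    H = rng.random((rank, n_docs)) + eps
    for _ in range(n_iter):
        # Entries with a large residual get a small weight, so outliers
        # influence the factors less than under a plain l2 objective.
        P = np.exp(-((X - W @ H) ** 2) / (2.0 * sigma ** 2))
        PX = P * X
        W *= (PX @ H.T) / ((P * (W @ H)) @ H.T + eps)
        H *= (W.T @ PX) / (W.T @ (P * (W @ H)) + eps)
    return W, H


if __name__ == "__main__":
    # Toy nonnegative matrix standing in for a term-document matrix.
    X = np.random.default_rng(42).random((30, 10))
    W, H = max_correntropy_nmf(X, rank=3, sigma=1.0, n_iter=200)
    # One simple clustering rule (an assumption): assign each document
    # to the basis vector with the largest coefficient in its column of H.
    labels = H.argmax(axis=0)
    print("correntropy:", correntropy(X, W @ H, 1.0))
    print("cluster labels:", labels)
```

In this sketch the columns of `W` play the role of the learned basis vectors of the semantic feature space, and the columns of `H` give each document's coordinates in that space. The bandwidth `sigma` controls how strongly large residuals are down-weighted; how it should be chosen is not stated in the abstract, so the fixed value here is only a placeholder.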