Documents clustering based on max-correntropy nonnegative matrix factorization

2014 International Conference on Machine Learning and Cybernetics · Pub Date: 2014-07-13 · DOI: 10.1109/ICMLC.2014.7009720

Le Li, Jianjun Yang, Yang Xu, Zhen Qin, Honggang Zhang


Citations: 16

Abstract



Nonnegative matrix factorization (NMF) has been successfully applied to many classification and clustering problems. Commonly used NMF algorithms mainly aim to minimize the ℓ2 distance or the Kullback-Leibler (KL) divergence, which may not be suitable for nonlinear cases. In this paper, we propose a new decomposition method for document clustering that maximizes the correntropy between the original matrix and the product of two low-rank matrices. This method also allows us to learn new basis vectors of the semantic feature space from data. To our knowledge, no existing work clusters high-dimensional document data by maximizing the correntropy in NMF. Our experimental results show that the proposed method outperforms other NMF variants on the Reuters-21578 and TDT2 datasets.
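The abstract does not give the objective or the update rules, so the following is only a minimal sketch of one common way to maximize correntropy in NMF, not the authors' algorithm. It assumes an entry-wise Gaussian-kernel correntropy, sum over i,j of exp(-(X_ij - (WH)_ij)^2 / (2 sigma^2)), and uses half-quadratic reweighting: each iteration computes per-entry weights from the current residuals, then applies the standard multiplicative updates for the resulting weighted least-squares problem. The function names, the kernel width `sigma`, the iteration count, and the argmax cluster assignment are all illustrative assumptions.

```python
import numpy as np


def correntropy(X, W, H, sigma):
    """Sum of Gaussian-kernel similarities between X and its approximation W @ H."""
    E = X - W @ H
    return float(np.exp(-(E ** 2) / (2.0 * sigma ** 2)).sum())


def max_correntropy_nmf(X, k, sigma=1.0, n_iter=200, eps=1e-10, seed=0):
    """Illustrative NMF that increases correntropy via half-quadratic reweighting.

    X is a nonnegative (terms x documents) matrix and k is the number of clusters.
    sigma, n_iter, eps and seed are assumed defaults, not values from the paper.
    """
    rng = np.random.default_rng(seed)
    m, n = X.shape
    W = rng.random((m, k)) + eps  # columns are basis vectors in term space
    H = rng.random((k, n)) + eps  # columns are document encodings
    for _ in range(n_iter):
        # Entries with large residuals get small weights, so outliers count less.
        E = X - W @ H
        P = np.exp(-(E ** 2) / (2.0 * sigma ** 2))
        # Multiplicative updates for the weighted least-squares subproblem
        # (weights P are held fixed for both updates).
        W *= ((P * X) @ H.T) / ((P * (W @ H)) @ H.T + eps)
        H *= (W.T @ (P * X)) / (W.T @ (P * (W @ H)) + eps)
    return W, H


def cluster_labels(H):
    """Assign each document (column of H) to the basis with the largest coefficient."""
    return np.argmax(H, axis=0)
```

With a term-document matrix `X`, a call such as `W, H = max_correntropy_nmf(X, k=10)` followed by `cluster_labels(H)` would give one cluster index per document. Keeping `sigma` fixed means each reweighting step cannot decrease the correntropy; adapting `sigma` per iteration is a common variant but loses that guarantee.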
