An Efficient Nonnegative Matrix Factorization Topic Modeling for Business Intelligence

K. PrashantGokul, M. Sundararajan
{"title":"An Efficient Nonnegative Matrix Factorization Topic Modeling for Business Intelligence","authors":"K. PrashantGokul, M. Sundararajan","doi":"10.4108/EAI.7-6-2021.2308681","DOIUrl":null,"url":null,"abstract":". Topic models can give us a knowledge into the basic latent design of an enormous corpus of documents. A scope of strategies have been planned in the writing, including probabilistic topic models and methods dependent on matrix factorization. Notwithstanding, the subsequent topics frequently address just broad, in this manner excess information about the data instead of minor, yet possibly significant information to clients. To handle this issue, we propose a novel sparseness improvement model of negative matrix factorization for finding excellent nearby topics. In any case, in the two cases, standard executions depend on stochastic components in their instatement stage, which can possibly prompt various outcomes being produced on a similar corpus when utilizing a similar boundary values. To address this issue in the context of matrix factorization for topic modeling, we propose the utilization of ensemble learning procedures. We show the useful utility of ENMF on New York Times dataset, and find that ENMF is particularly helpful for applied or expansive topics, where topic key terms are not surely known. We find that ENMF accomplishes higher weighted Jaccard similarity scores than the contemporary strategies..","PeriodicalId":422301,"journal":{"name":"Proceedings of the First International Conference on Computing, Communication and Control System, I3CAC 2021, 7-8 June 2021, Bharath University, Chennai, India","volume":"4 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"1900-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Proceedings of the First International Conference on Computing, Communication and Control System, I3CAC 2021, 7-8 June 2021, Bharath University, Chennai, India","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.4108/EAI.7-6-2021.2308681","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 0

Abstract

Topic models can give us insight into the underlying latent structure of a large corpus of documents. A range of techniques has been proposed in the literature, including probabilistic topic models and methods based on matrix factorization. However, the resulting topics often capture only general, and therefore redundant, information about the data rather than minor yet potentially important information for users. To address this issue, we propose a novel sparseness-enhanced nonnegative matrix factorization model for discovering high-quality local topics. However, in both families of methods, standard implementations rely on stochastic elements in their initialization phase, which can lead to different results being produced on the same corpus even when the same parameter values are used. To address this issue in the context of matrix factorization for topic modeling, we propose the use of ensemble learning techniques. We demonstrate the practical utility of ENMF on the New York Times dataset and find that ENMF is particularly helpful for applied or broad topics, where topic key terms are not well known. We find that ENMF achieves higher weighted Jaccard similarity scores than contemporary methods.
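Note: the paper itself does not include an implementation. The sketch below illustrates one plausible ensemble NMF workflow of the kind the abstract describes: several NMF runs with different random initializations are stacked and re-factorized into consensus topics, and a weighted Jaccard measure compares topic weight vectors. The use of scikit-learn, the function names (ensemble_nmf_topics, weighted_jaccard), and all parameter values are illustrative assumptions, not the authors' ENMF method.

    # Minimal sketch (not the authors' implementation): ensemble NMF topic modeling.
    # Assumes scikit-learn is installed and `docs` is a list of raw text documents.
    import numpy as np
    from sklearn.feature_extraction.text import TfidfVectorizer
    from sklearn.decomposition import NMF

    def ensemble_nmf_topics(docs, n_topics=10, n_runs=20, top_n=10):
        # Build a nonnegative document-term matrix.
        vectorizer = TfidfVectorizer(stop_words="english", max_features=5000)
        X = vectorizer.fit_transform(docs)
        terms = np.array(vectorizer.get_feature_names_out())

        # Base layer: run NMF several times with different random initializations
        # and collect each run's topic-term factors.
        topic_term_rows = []
        for seed in range(n_runs):
            model = NMF(n_components=n_topics, init="random",
                        random_state=seed, max_iter=400)
            model.fit(X)
            topic_term_rows.append(model.components_)
        stacked = np.vstack(topic_term_rows)   # (n_runs * n_topics) x n_terms

        # Ensemble layer: factorize the stacked topic-term matrix to obtain
        # consensus topics that are stable across initializations.
        ensemble = NMF(n_components=n_topics, init="nndsvd", max_iter=400)
        ensemble.fit(stacked)
        H = ensemble.components_               # consensus topic-term weights

        # Describe each consensus topic by its top-ranked terms.
        return [terms[np.argsort(row)[::-1][:top_n]].tolist() for row in H]

    def weighted_jaccard(u, v):
        # Weighted Jaccard similarity between two nonnegative weight vectors,
        # usable as an agreement measure between topic descriptors.
        return np.minimum(u, v).sum() / np.maximum(u, v).sum()

In such a setup, ensemble_nmf_topics(docs) would return the top terms of each consensus topic for a hypothetical document list docs; the second-layer factorization is what damps the run-to-run variation caused by random initialization.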