Bridging insight gaps in topic dependency discovery with a knowledge-inspired topic model

Impact factor 7.4; Zone 1 (Management); JCR Q1 (Computer Science, Information Systems)
Yi-Kun Tang, Heyan Huang, Xuewen Shi, Xian-Ling Mao
DOI: 10.1016/j.ipm.2024.103911
Published: 2024-10-09, Information Processing & Management
Full text: https://www.sciencedirect.com/science/article/pii/S030645732400270X
Citations: 0

Abstract

Discovering intricate dependencies between topics in topic modeling is challenging due to the noisy and incomplete nature of real-world data and the inherent complexity of topic dependency relationships. In practice, certain basic dependency relationships have been manually annotated and can serve as valuable knowledge resources, enhancing the learning of topic dependencies. To this end, we propose a novel topic model, called Knowledge-Inspired Dependency-Aware Dirichlet Neural Topic Model (KDNTM). Specifically, we first propose Dependency-Aware Dirichlet Neural Topic Model (DepDirNTM), which can discover semantically coherent topics and complex dependencies between these topics from textual data. Then, we propose three methods to leverage accessible external dependency knowledge under the framework of DepDirNTM to enhance the discovery of topic dependencies. Extensive experiments on real-world corpora demonstrate that our models outperform 12 state-of-the-art baselines in terms of topic quality and multi-labeled text classification in most cases, achieving up to a 14% improvement in topic quality over the best baseline. Visualizations of the learned dependency relationships further highlight the benefits of integrating external knowledge, confirming its substantial impact on the effectiveness of topic modeling.
Source journal

Information Processing & Management (Engineering & Technology - Computer Science: Information Systems)
CiteScore: 17.00
Self-citation rate: 11.60%
Articles published: 276
Review time: 39 days

About the journal: Information Processing and Management is dedicated to publishing cutting-edge original research at the convergence of computing and information science. Its scope encompasses theory, methods, and applications across various domains, including advertising, business, health, information science, information technology marketing, and social computing. The journal aims to serve both primary researchers and practitioners by offering a platform for the timely dissemination of advanced and topical issues in this interdisciplinary field. It places particular emphasis on original research articles, research survey articles, research method articles, and articles addressing critical applications of research.