Yi-Kun Tang , Heyan Huang , Xuewen Shi , Xian-Ling Mao
Title: Bridging insight gaps in topic dependency discovery with a knowledge-inspired topic model
Journal: Information Processing & Management (Impact Factor 7.4; JCR Q1, Computer Science, Information Systems)
Published: 2024-10-09
DOI: 10.1016/j.ipm.2024.103911
URL: https://www.sciencedirect.com/science/article/pii/S030645732400270X
Citations: 0
Abstract
Discovering intricate dependencies between topics in topic modeling is challenging due to the noisy and incomplete nature of real-world data and the inherent complexity of topic dependency relationships. In practice, certain basic dependency relationships have been manually annotated and can serve as valuable knowledge resources that enhance the learning of topic dependencies. To this end, we propose a novel topic model called the Knowledge-Inspired Dependency-Aware Dirichlet Neural Topic Model (KDNTM). Specifically, we first propose the Dependency-Aware Dirichlet Neural Topic Model (DepDirNTM), which can discover semantically coherent topics and complex dependencies between these topics from textual data. Then, we propose three methods that leverage accessible external dependency knowledge within the DepDirNTM framework to enhance the discovery of topic dependencies. Extensive experiments on real-world corpora demonstrate that, in most cases, our models outperform 12 state-of-the-art baselines in terms of topic quality and multi-label text classification, achieving up to a 14% improvement in topic quality over the best baseline. Visualizations of the learned dependency relationships further highlight the benefits of integrating external knowledge, confirming its substantial impact on the effectiveness of topic modeling.
About the journal:
Information Processing and Management is dedicated to publishing cutting-edge original research at the convergence of computing and information science. Our scope encompasses theory, methods, and applications across various domains, including advertising, business, health, information science, information technology marketing, and social computing.
We aim to cater to the interests of both primary researchers and practitioners by offering an effective platform for the timely dissemination of advanced and topical issues in this interdisciplinary field. The journal places particular emphasis on original research articles, research survey articles, research method articles, and articles addressing critical applications of research. Join us in advancing knowledge and innovation at the intersection of computing and information science.