Keyword-Assisted Topic Models

IF 5 | Zone 1 (Sociology) | JCR Q1, Political Science
Shusei Eshima, Kosuke Imai, Tomoya Sasaki
American Journal of Political Science. Published 2023-04-01. DOI: 10.1111/ajps.12779
https://onlinelibrary.wiley.com/doi/10.1111/ajps.12779
Citations: 0

Abstract

In recent years, fully automated content analysis based on probabilistic topic models has become popular among social scientists because of their scalability. However, researchers find that these models often fail to measure specific concepts of substantive interest by inadvertently creating multiple topics with similar content and combining distinct themes into a single topic. In this article, we empirically demonstrate that providing a small number of keywords can substantially enhance the measurement performance of topic models. An important advantage of the proposed keyword-assisted topic model (keyATM) is that the specification of keywords requires researchers to label topics prior to fitting a model to the data. This contrasts with a widespread practice of post hoc topic interpretation and adjustments that compromises the objectivity of empirical findings. In our application, we find that keyATM provides more interpretable results, has better document classification performance, and is less sensitive to the number of topics.

Source journal
CiteScore: 9.30
Self-citation rate: 2.40%
Articles published: 61
About the journal: The American Journal of Political Science (AJPS) publishes research in all major areas of political science including American politics, public policy, international relations, comparative politics, political methodology, and political theory. Founded in 1956, the AJPS publishes articles that make outstanding contributions to scholarly knowledge about notable theoretical concerns, puzzles or controversies in any subfield of political science.