Topics, Concepts, and Measurement: A Crowdsourced Procedure for Validating Topics as Measures

Impact Factor 4.7 · CAS Zone 2 (Sociology) · JCR Q1 (Political Science)
Luwei Ying, J. Montgomery, Brandon M Stewart
Journal: Political Analysis
DOI: 10.1017/pan.2021.33
Published: 2021-09-27
Citations: 22

Abstract

Topic models, as developed in computer science, are effective tools for exploring and summarizing large document collections. When applied in social science research, however, they are commonly used for measurement, a task that requires careful validation to ensure that the model outputs actually capture the desired concept of interest. In this paper, we review current practices for topic validation in the field and show that extensive model validation is increasingly rare, or at least not systematically reported in papers and appendices. To supplement current practices, we refine an existing crowd-sourcing method by Chang and coauthors for validating topic quality and go on to create new procedures for validating conceptual labels provided by the researcher. We illustrate our method with an analysis of Facebook posts by U.S. Senators and provide software and guidance for researchers wishing to validate their own topic models. While tailored, case-specific validation exercises will always be best, we aim to improve standard practices by providing a general-purpose tool to validate topics as measures.
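The crowd-sourcing method by Chang and coauthors that the abstract refers to is the word-intrusion task: crowd workers see a topic's top words plus one "intruder" word from a different topic, and a coherent topic is one whose intruder is easy to spot. A minimal sketch of how such a task can be constructed is below; the topic-word lists and function name are toy assumptions for illustration, not the paper's actual data or software.

```python
import random

# Assumed toy topics: each maps to its top words, ordered by probability.
topics = {
    "healthcare": ["insurance", "medicare", "hospital", "patients", "coverage"],
    "defense":    ["military", "troops", "veterans", "security", "defense"],
}

def word_intrusion_task(topic_name, topics, n_top=4, rng=None):
    """Build one intrusion question: n_top top words from the topic plus
    one 'intruder' drawn from another topic. If workers reliably pick out
    the intruder, the topic is judged semantically coherent."""
    rng = rng or random.Random(0)
    own_words = topics[topic_name][:n_top]
    other_topic = rng.choice([t for t in topics if t != topic_name])
    intruder = topics[other_topic][0]  # high-probability word elsewhere
    choices = own_words + [intruder]
    rng.shuffle(choices)  # hide the intruder's position
    return choices, intruder

choices, intruder = word_intrusion_task("healthcare", topics)
print(choices, "| intruder:", intruder)
```

In practice the researcher would generate many such questions across topics and workers, then score each topic by the fraction of workers who identify the intruder correctly.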
Source journal: Political Analysis
CiteScore: 8.80
Self-citation rate: 3.70%
Articles per year: 30
Journal description: Political Analysis chronicles exciting developments in political methodology by publishing the most sophisticated scholarship in the field. It is the place to learn new methods, to find some of the best empirical scholarship, and to publish your best research.