A systematic review of the use of topic models for short text social media analysis

IF 10.7 | Zone 2, Computer Science | Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE

Artificial Intelligence Review | Pub Date: 2023-05-01 | DOI: 10.1007/s10462-023-10471-x

Caitlin Doogan Poet Laureate, Wray Buntine, Henry Linger

Artificial Intelligence Review, vol. 56, no. 12, pp. 14223-14255. Journal Article. Full text: https://link.springer.com/article/10.1007/s10462-023-10471-x

Citations: 2

Abstract

Recently, research on short text topic models has addressed the challenges of social media datasets. These models are typically evaluated using automated measures. However, recent work suggests that these evaluation measures do not inform whether the topics produced can yield meaningful insights for those examining social media data. Efforts to address this issue, including gauging the alignment between automated and human evaluation tasks, are hampered by a lack of knowledge about how researchers use topic models. Further problems could arise if researchers do not construct topic models optimally or use them in a way that exceeds the models’ limitations. These scenarios threaten the validity of topic model development and the insights produced by researchers employing topic modelling as a methodology. However, there is currently a lack of information about how and why topic models are used in applied research. As such, we performed a systematic literature review of 189 articles where topic modelling was used for social media analysis to understand how and why topic models are used for social media analysis. Our results suggest that the development of topic models is not aligned with the needs of those who use them for social media analysis. We have found that researchers use topic models sub-optimally. There is a lack of methodological support for researchers to build and interpret topics. We offer a set of recommendations for topic model researchers to address these problems and bridge the gap between development and applied research on short text topic models.


Supplementary information: The online version contains supplementary material available at 10.1007/s10462-023-10471-x.


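Source journal

Artificial Intelligence Review (Engineering & Technology - Computer Science: Artificial Intelligence)

CiteScore: 22.00

Self-citation rate: 3.30%

Articles published: 194

Review time: 5.3 months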

About the journal: Artificial Intelligence Review, a fully open access journal, publishes cutting-edge research in artificial intelligence and cognitive science. It features critical evaluations of applications, techniques, and algorithms, providing a platform for both researchers and application developers. The journal includes refereed survey and tutorial articles, along with reviews and commentary on significant developments in the field.