Unmasking the Twitter Discourses on Masks During the COVID-19 Pandemic: User Cluster-Based BERT Topic Modeling Approach.

Impact factor: 3.5; JCR: Q1 (HEALTH CARE SCIENCES & SERVICES)

JMIR infodemiology; Pub Date: 2022-07-01; DOI: 10.2196/41198

Weiai Wayne Xu, Jean Marie Tshimula, Ève Dubé, Janice E Graham, Devon Greyson, Noni E MacDonald, Samantha B Meyer


Citations: 1

Abstract


Unmasking the Twitter Discourses on Masks During the COVID-19 Pandemic: User Cluster-Based BERT Topic Modeling Approach.


Background: The COVID-19 pandemic has spotlighted the politicization of public health issues. A public health monitoring tool must be equipped to reveal a public health measure's political context and guide better interventions. In its current form, infoveillance tends to neglect identity and interest-based users, hence being limited in exposing how public health discourse varies by different political groups. Adopting an algorithmic tool to classify users and their short social media texts might remedy that limitation.

Objective: We aimed to implement a new computational framework to investigate discourses and temporal changes in topics unique to different user clusters. The framework was developed to contextualize how web-based public health discourse varies by identity and interest-based user clusters. We used masks and mask wearing during the early stage of the COVID-19 pandemic in the English-speaking world as a case study to illustrate the application of the framework.

Methods: We first clustered Twitter users based on their identities and interests as expressed through Twitter bio pages. Exploratory text network analysis revealed the salient political, social, and professional identities of the various user clusters. We then used BERT Topic modeling to identify topics by user cluster. The framework reveals how web-based discourse shifted over time and varied across 4 user clusters: conservative, progressive, general public, and public health professionals.
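The two-stage idea in the Methods (classify users a priori from their bios, then summarize what each cluster discusses over time) can be illustrated with a minimal sketch. It is not the authors' pipeline: keyword matching stands in for the exploratory text network analysis, plain term counts stand in for BERT Topic modeling, and the keyword lists, field names (`bio`, `date`, `text`), and sample records are all hypothetical.

```python
from collections import Counter, defaultdict

# Hypothetical keyword lists (assumption). The paper derives clusters from
# bio text via exploratory text network analysis, not from fixed keywords.
CLUSTER_KEYWORDS = {
    "conservative": {"maga", "conservative"},
    "progressive": {"progressive", "resist"},
    "public_health": {"epidemiologist", "nurse", "mph"},
}
STOPWORDS = {"the", "a", "is", "to", "of", "and", "in", "on", "for", "are"}


def tokenize(text):
    """Lowercase the text and split it on non-alphanumeric characters."""
    cleaned = "".join(ch if ch.isalnum() else " " for ch in text.lower())
    return cleaned.split()


def assign_cluster(bio):
    """Return the first cluster whose keywords appear in the bio."""
    words = set(tokenize(bio))
    for cluster, keywords in CLUSTER_KEYWORDS.items():
        if words & keywords:
            return cluster
    return "general_public"  # fallback when no keyword matches


def top_terms_by_cluster_and_month(tweets, n=3):
    """Count non-stopword terms per (cluster, YYYY-MM) group.

    Term counts are a crude stand-in for a topic model (assumption).
    """
    counts = defaultdict(Counter)
    for tweet in tweets:
        key = (assign_cluster(tweet["bio"]), tweet["date"][:7])
        counts[key].update(
            term for term in tokenize(tweet["text"]) if term not in STOPWORDS
        )
    return {
        key: [term for term, _ in counter.most_common(n)]
        for key, counter in counts.items()
    }


# Made-up records for illustration only; not data from the study.
SAMPLE_TWEETS = [
    {"bio": "Proud conservative, MAGA", "date": "2020-04-03",
     "text": "Mask mandates are government overreach"},
    {"bio": "Epidemiologist and nurse", "date": "2020-04-10",
     "text": "Masks reduce transmission, wear masks"},
    {"bio": "Coffee lover", "date": "2020-05-01",
     "text": "Where to buy a mask"},
]

if __name__ == "__main__":
    for key, terms in top_terms_by_cluster_and_month(SAMPLE_TWEETS).items():
        print(key, terms)
```

Keying the counts on both the cluster and the month is what lets the same summary be read along two axes, across user groups and across time, which mirrors the comparison the Methods describe.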

Results: This study demonstrated the importance of a priori user classification and longitudinal topical trends in understanding the political context of web-based public health discourse. The framework reveals that the political groups and the general public focused on the science of mask wearing and the partisan politics of mask policies. A populist discourse that pits citizens against elites and institutions was identified in some tweets. Politicians (such as Donald Trump) and geopolitical tensions with China were found to drive the discourse. The framework also shows limited participation of public health professionals compared with other users.

Conclusions: We conclude by discussing the importance of a priori user classification in analyzing web-based discourse and illustrating the fit of BERT Topic modeling in identifying contextualized topics in short social media texts.


Source journal

JMIR infodemiology

CiteScore

4.80

Self-citation rate

0.00%

Articles published