What Are People Concerned About During the Pandemic? Detecting Evolving Topics about COVID-19 from Twitter.

IF 5.9 Q1 Computer Science
Chia-Hsuan Chang, Michal Monselise, Christopher C Yang
Journal of Healthcare Informatics Research. Published 2021-01-01 (Epub 2021-01-17). DOI: 10.1007/s41666-020-00083-3 (https://doi.org/10.1007/s41666-020-00083-3). Full text: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7811869/pdf/

Abstract

With the novel coronavirus (COVID-19) pandemic affecting the lives of citizens in more than 200 countries, policy makers and clinicians need to understand public sentiment and track the spread of the disease. Social media is one valuable source of insight into public sentiment. This study aims to extract that insight by producing a weekly list of the most discussed COVID-19 topics on Twitter and monitoring how those topics evolve from week to week. We propose two topic mining methods that can handle a large-scale dataset, rolling online non-negative matrix factorization (Rolling-ONMF) and sliding online non-negative matrix factorization (Sliding-ONMF), and compare the insights produced by both techniques. Each algorithm produces 425 topics over the course of 17 weeks. However, topics that have not evolved from one week to the next beyond a certain evolution threshold are consolidated into a single topic. Because the topics produced by the Rolling-ONMF algorithm each week depend on the topics from the previous week, we find that the Sliding-ONMF algorithm produces more varied topics each week; however, the topics produced by the Rolling-ONMF algorithm contain keywords that appear more consistent with each other when the terms are reviewed manually. We also observe that the Sliding-ONMF algorithm is able to capture events with shorter time frames rather than ones that last for many months, while the Rolling-ONMF algorithm detects more general themes because its higher average evolution score leads to more topic consolidation. We have also conducted a qualitative analysis and grouped the detected topics into themes. A number of important themes are identified, such as government policy, economic crisis, COVID-19-related updates, COVID-19-related events, prevention, vaccines and treatments, and COVID-19 testing. These themes reflect the concerns related to the pandemic expressed on social media.
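
The consolidation step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes that each weekly topic is a sparse mapping from terms to weights, and that the evolution score behaves like a cosine similarity between a topic's term weights in consecutive weeks, so that a high score means little change. That reading is suggested by the remark that a higher average score leads to more consolidation, but the abstract does not give the exact definition. The function names, the threshold value, and the toy data are all hypothetical. For scale, 425 topics over 17 weeks corresponds to 25 topics per week for each algorithm.

```python
import math


def cosine(a, b):
    """Cosine similarity between two sparse term-weight dicts."""
    dot = sum(w * b.get(term, 0.0) for term, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def consolidate(weekly_topic, threshold):
    """Group consecutive weeks of one topic slot into runs.

    Assumption: a week joins the current run when its similarity to the
    previous week is at least `threshold`; otherwise a new run starts.
    Returns a list of runs, each a list of week indices.
    """
    runs = []
    for week, vec in enumerate(weekly_topic):
        if runs and cosine(weekly_topic[week - 1], vec) >= threshold:
            runs[-1].append(week)
        else:
            runs.append([week])
    return runs


# Toy data (hypothetical): one topic slot tracked over three weeks.
weeks = [
    {"lockdown": 0.9, "order": 0.4},
    {"lockdown": 0.8, "order": 0.5},
    {"vaccine": 0.9, "trial": 0.5},
]
runs = consolidate(weeks, threshold=0.5)
print(runs)  # [[0, 1], [2]]
```

In this toy case the first two weeks share almost the same terms and are merged into one topic, while the third week shares no terms with the second and starts a new one. Under the same assumption, a method whose weekly topics depend on the previous week would tend to produce longer runs, and therefore fewer and more general consolidated topics.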

Source journal
Journal of Healthcare Informatics Research (Computer Science: Computer Science Applications)
CiteScore: 13.60
Self-citation rate: 1.70%
Articles published: 12
Journal description: Journal of Healthcare Informatics Research serves as a publication venue for innovative technical contributions highlighting analytics, systems, and human factors research in healthcare informatics. The journal is concerned with the application of computer science principles, information science principles, information technology, and communication technology to address problems in healthcare and everyday wellness. It highlights the most cutting-edge technical contributions in computing-oriented healthcare informatics. The journal covers three major tracks: (1) analytics, which focuses on data analytics, knowledge discovery, and predictive modeling; (2) systems, which focuses on building healthcare informatics systems (e.g., architecture, framework, design, engineering, and application); (3) human factors, which focuses on understanding users or context, interface design, health behavior, and user studies of healthcare informatics applications.
Topics include but are not limited to: healthcare software architecture, framework, design, and engineering; electronic health records; medical data mining; predictive modeling; medical information retrieval; medical natural language processing; healthcare information systems; smart health and connected health; social media analytics; mobile healthcare; medical signal processing; human factors in healthcare; usability studies in healthcare; user-interface design for medical devices and healthcare software; health service delivery; health games; security and privacy in healthcare; medical recommender systems; healthcare workflow management; disease profiling and personalized treatment; visualization of medical data; intelligent medical devices and sensors; RFID solutions for healthcare; healthcare decision analytics and support systems; epidemiological surveillance systems and intervention modeling; consumer and clinician health information needs, seeking, sharing, and use; semantic Web, linked data, and ontology; collaboration technologies for healthcare; assistive and adaptive ubiquitous computing technologies; statistics and quality of medical data; healthcare delivery in developing countries; health systems modeling and simulation; computer-aided diagnosis.