Disease outbreak prediction using natural language processing: a review

Impact factor: 2.5; Region 4 (Computer Science); JCR Q3 (Computer Science, Artificial Intelligence)
Avneet Singh Gautam, Zahid Raza
Journal: Knowledge and Information Systems. Published: 2024-08-06. DOI: 10.1007/s10115-024-02192-6 (https://doi.org/10.1007/s10115-024-02192-6)
Citation count: 0

Abstract

Research on disease outbreak prediction has attracted a surge of interest owing to the COVID-19 pandemic. Natural language processing applied to user-generated text data has proven quite effective for this task. Disease outbreaks that occur frequently can be predicted easily, whereas novel disease outbreaks are difficult to predict. This review summarizes research on disease outbreaks that applies natural language processing techniques to datasets such as news headlines, tweets, and search engine queries. Existing state-of-the-art systems are discussed analytically, together with their contributions and limitations, giving an overview of existing research in the domain of disease outbreak prediction. A total of 146 articles were reviewed in this study, and the results show that news and Twitter datasets are the most frequently used for predicting disease outbreaks. The review highlights that many works in the literature rely on Internet-sourced text data tied to specific outbreaks, viz. news, tweets, and search engine queries. This reliance limits a disease outbreak prediction system to predicting only specific disease outbreaks, and it motivates the development of systems capable of predicting disease outbreaks without any such bias.

Source journal
Knowledge and Information Systems (Engineering and Technology: Computer Science, Artificial Intelligence)
CiteScore: 5.70
Self-citation rate: 7.40%
Articles published: 152
Review time: 7.2 months
Journal description: Knowledge and Information Systems (KAIS) provides an international forum for researchers and professionals to share their knowledge and report new advances on all topics related to knowledge systems and advanced information systems. This monthly peer-reviewed archival journal publishes state-of-the-art research reports on emerging topics in KAIS, reviews of important techniques in related areas, and application papers of interest to a general readership.