Leveraging Google Search Data and Artificial Intelligence Methods for Provincial-level Influenza Forecasting: A South African Case Study

Seun O. Olukanmi, F. Nelwamondo, N. Nwulu
Journal: Int. J. Online Biomed. Eng. (Journal Article)
Publication date: 2022-08-31
DOI: 10.3991/ijoe.v18i11.29899 (https://doi.org/10.3991/ijoe.v18i11.29899)

Abstract

This paper investigates whether Google search patterns, combined with Artificial Intelligence (AI) techniques, can support timely forecasting of influenza-like illness (ILI) in each of the nine South African provinces. Traditional surveillance methods are limited by reporting delays. Existing digital disease surveillance studies that use alternative online data have scarcely covered sub-Saharan African countries. In South Africa, Google search data has only recently been studied for ILI surveillance at the national level. Meanwhile, differences in socio-economic and technological conditions across provinces call for a finer spatial investigation. We first perform correlation analysis between Google Trends (GT) data for 21 ILI-related terms and real-life ILI surveillance data for each province. We then develop models to assess how well these GT data forecast ILI rates, using time series, machine learning, and deep learning methods. We observe sufficient correlation for only two of the nine provinces, Gauteng and Western Cape, so GT data could be used to forecast ILI only in these two provinces. Interestingly, these two provinces are regarded as the most economically developed. In the other seven provinces, long short-term memory (LSTM), a deep learning technique, gives more accurate predictions than a baseline autoregressive model when only past ILI data are used to forecast future ILI trends. The results show that, for provinces where GT data are sufficiently available, these data are not only free and fast to obtain but also an effective predictor of future ILI rates, both on their own and when added to past ILI data. The correlation analysis suggests an association between provincial socio-economic development and the use of digital platforms for disease surveillance. Overall, the study establishes the need for finer-scale ILI forecasting to inform targeted planning of disease surveillance and interventions.
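
The two analysis steps named in the abstract (screening search terms by correlation, then comparing a forecast from past ILI data alone with one that also uses GT data) might be sketched as follows. This is an illustration only and not the authors' pipeline: the data are synthetic, the term names and the 0.5 correlation threshold are hypothetical, and a plain least-squares autoregression stands in for the time series, machine learning, and LSTM models the paper evaluates.

```python
import numpy as np


def make_synthetic_province(n_weeks=156, seed=0):
    """Synthetic weekly series (an assumption, not real surveillance data)."""
    rng = np.random.default_rng(seed)
    weeks = np.arange(n_weeks)
    season = np.sin(2 * np.pi * weeks / 52)
    ili = 10 + 5 * season + rng.normal(0, 0.5, n_weeks)
    gt = {
        # Hypothetical search terms: one seasonal, one pure noise.
        "flu": 50 + 20 * season + rng.normal(0, 2, n_weeks),
        "unrelated": rng.normal(50, 5, n_weeks),
    }
    return ili, gt


def pearson_by_term(gt, ili):
    """Pearson correlation between each search-term series and the ILI series."""
    return {term: float(np.corrcoef(series, ili)[0, 1]) for term, series in gt.items()}


def select_terms(correlations, threshold=0.5):
    """Keep terms whose correlation meets an illustrative (assumed) threshold."""
    return [term for term, r in correlations.items() if r >= threshold]


def lagged_design(ili, gt=None, lags=2):
    """Rows of [1, ILI lags, optional GT lags]; target is ILI at week t."""
    rows = []
    for t in range(lags, len(ili)):
        feats = [1.0] + list(ili[t - lags:t])
        if gt is not None:
            feats += list(gt[t - lags:t])
        rows.append(feats)
    return np.array(rows), np.asarray(ili[lags:])


def fit_and_forecast(ili, gt=None, lags=2, n_test=20):
    """Fit by least squares on the early weeks, report RMSE on the last n_test."""
    X, y = lagged_design(ili, gt, lags)
    coef, *_ = np.linalg.lstsq(X[:-n_test], y[:-n_test], rcond=None)
    pred = X[-n_test:] @ coef
    rmse = float(np.sqrt(np.mean((pred - y[-n_test:]) ** 2)))
    return pred, rmse


ili, gt = make_synthetic_province()
correlations = pearson_by_term(gt, ili)
selected = select_terms(correlations)

# Baseline: past ILI only. Augmented: past ILI plus the first selected GT term.
_, rmse_ar = fit_and_forecast(ili)
_, rmse_gt = fit_and_forecast(ili, gt[selected[0]])
print("selected terms:", selected)
print("RMSE (ILI only):", round(rmse_ar, 3), "RMSE (ILI + GT):", round(rmse_gt, 3))
```

On real data, the same comparison would be run per province, and a province with no term passing the correlation screen would fall back to the ILI-only model.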