Analysis of performance improvements and bias associated with the use of human mobility data in COVID-19 case prediction models

S. Abrar, N. Awasthi, D. Smolyak, V. Frías-Martínez
Journal: ACM Journal on Computing and Sustainable Societies
Publication date: 2023-08-26
Publication type: Journal Article
DOI: 10.1145/3616380 (https://doi.org/10.1145/3616380)
Citations: 0

Abstract

The COVID-19 pandemic has mainstreamed human mobility data into the public domain, with research focused on understanding the impact of mobility reduction policies as well as on regional COVID-19 case prediction models. Nevertheless, current research on COVID-19 case prediction tends to focus on performance improvements, masking relevant insights about when mobility data does not help and, more importantly, why; such insights are needed to adequately inform local decision making. In this paper, we carry out a systematic analysis to reveal the conditions under which human mobility data does or does not provide an enhancement over individual regional COVID-19 case prediction models that do not use mobility as a source of information. Our analysis, focused on US county-based COVID-19 case prediction models, shows that (1) at most 60% of counties improve their performance after adding mobility data; (2) the performance improvements are modest, with median correlation improvements of approximately 0.13; (3) improvements were lower for counties with higher Black, Hispanic, and other non-White populations, as well as low-income and rural populations, pointing to potential bias in the mobility data that negatively impacts predictive performance; and (4) different mobility datasets, predictive models, and training approaches bring about diverse performance improvements.