Analysis of performance improvements and bias associated with the use of human mobility data in COVID-19 case prediction models
S. Abrar, N. Awasthi, D. Smolyak, V. Frías-Martínez
ACM Journal on Computing and Sustainable Societies, published 2023-08-26
DOI: https://doi.org/10.1145/3616380
Abstract
The COVID-19 pandemic has mainstreamed human mobility data into the public domain, with research focused on understanding the impact of mobility reduction policies as well as on regional COVID-19 case prediction models. Nevertheless, current research on COVID-19 case prediction tends to focus on performance improvements, masking relevant insights about when mobility data does not help and, more importantly, why; research needs these insights in order to adequately inform local decision making. In this paper, we carry out a systematic analysis to reveal the conditions under which human mobility data does or does not improve on individual regional COVID-19 case prediction models that do not use mobility as a source of information. Our analysis, focused on US county-based COVID-19 case prediction models, shows that (1) at most 60% of counties improve their performance after adding mobility data; (2) the performance improvements are modest, with median correlation improvements of approximately 0.13; (3) improvements are lower for counties with higher Black, Hispanic, and other non-White populations as well as low-income and rural populations, pointing to potential bias in the mobility data negatively impacting predictive performance; and (4) different mobility datasets, predictive models, and training approaches bring about diverse performance improvements.
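To make the two headline metrics concrete (the share of counties that improve, and the median correlation improvement), the following is a minimal sketch of how such a per-county comparison could be computed. It is not the authors' code: the function names, the dictionary-based data layout, the use of Pearson correlation, and the choice to take the median over all counties are assumptions made for illustration, since the abstract does not specify them.

```python
from math import sqrt
from statistics import median


def pearson(xs, ys):
    """Pearson correlation between two equal-length numeric sequences.

    Assumes neither sequence is constant (otherwise the denominator is zero).
    """
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)


def summarize_improvement(actual, baseline_pred, mobility_pred):
    """Compare a baseline model and a mobility-augmented model per county.

    Each argument is a hypothetical dict mapping a county id to a list of
    case counts (observed, or predicted by the respective model).
    Returns (share of counties whose correlation improved,
             median change in correlation across all counties).
    """
    deltas = []
    for county, observed in actual.items():
        r_base = pearson(observed, baseline_pred[county])
        r_mob = pearson(observed, mobility_pred[county])
        deltas.append(r_mob - r_base)
    share_improved = sum(d > 0 for d in deltas) / len(deltas)
    return share_improved, median(deltas)
```

Called with three dictionaries keyed by county, `summarize_improvement` returns a pair analogous to the abstract's "at most 60%" and "approximately 0.13" figures, although the paper's exact definitions of these quantities may differ from the assumptions above.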