Estimation of unconfirmed COVID-19 cases from a cross-sectional survey of >10 000 households and a symptom-based machine learning model in Gilgit-Baltistan, Pakistan.
Daniel S Farrar, Lisa G Pell, Yasin Muhammad, Sher Hafiz Khan, Lauren Erdman, Diego G Bassani, Zachary Tanner, Imran Ahmed Chauhadry, Muhammad Karim, Falak Madhani, Shariq Paracha, Masood Ali Khan, Sajid Soofi, Monica Taljaard, Rachel F Spitzer, Sarah M Abu Fadaleh, Zulfiqar A Bhutta, Shaun K Morris
BMJ Public Health, 3(1), e001255. Journal article, published 2025-04-28. DOI: https://doi.org/10.1136/bmjph-2024-001255. Full text (PDF): https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12039044/pdf/
Abstract
Introduction: Robust estimates of COVID-19 prevalence in settings with limited capacity for SARS-CoV-2 molecular and serologic testing are scarce. We aimed to describe the epidemiology of confirmed and probable COVID-19 in Gilgit-Baltistan, and to develop a symptom-based predictive model to identify infected but undiagnosed individuals with COVID-19.
Methods: We conducted a cross-sectional survey in 10 257 randomly selected households in Gilgit-Baltistan from June to August 2021. Data regarding SARS-CoV-2 testing, healthcare worker (HCW) diagnoses, symptoms and outcomes since March 2020 were self-reported by households. 'Confirmed/probable' infection was defined as a positive test, HCW COVID-19 diagnosis or HCW pneumonia diagnosis with COVID-19-positive contact. Robust Poisson regression was conducted to assess differences in symptoms, outcomes and SARS-CoV-2 testing rates. We developed a symptom-based machine learning model to differentiate confirmed/probable infections from those with negative tests. We applied this model to untested respondents to estimate the total prevalence of SARS-CoV-2 infection.
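The three analysis groups implied by the case definition above can be sketched in a few lines of Python. This is only an illustration: the field names (`test_result`, `hcw_covid_diagnosis`, and so on) are hypothetical and not taken from the survey instrument, and the order in which the rules are checked (any confirmed/probable criterion takes precedence over a negative test) is an assumption, since the abstract does not say how conflicting reports were resolved.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Respondent:
    # Hypothetical field names; the real survey variables are not given in the abstract.
    test_result: Optional[str] = None       # "positive", "negative", or None if never tested
    hcw_covid_diagnosis: bool = False       # healthcare worker diagnosed COVID-19
    hcw_pneumonia_diagnosis: bool = False   # healthcare worker diagnosed pneumonia
    covid_positive_contact: bool = False    # reported contact with a COVID-19-positive person


def classify(r: Respondent) -> str:
    """Assign a respondent to one of the three groups described in the Methods."""
    confirmed_or_probable = (
        r.test_result == "positive"
        or r.hcw_covid_diagnosis
        or (r.hcw_pneumonia_diagnosis and r.covid_positive_contact)
    )
    if confirmed_or_probable:
        # Assumption: a confirmed/probable criterion outranks a negative test.
        return "confirmed/probable"
    if r.test_result == "negative":
        return "negative test"
    return "untested"
```

Under this reading, the model is trained on the first two groups and then applied to the third.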
Results: Data were collected for 77 924 people. Overall, 314 (0.5%) had confirmed/probable infections, 3263 (4.4%) had negative tests and 74 347 (95.1%) were untested. Children were tested less often than adults (adjusted prevalence ratio (aPR) 0.08, 95% CI 0.06 to 0.12 for ages 1-4 years vs 30-39 years), while males were tested more often than females (aPR 1.51, 95% CI 1.40 to 1.63). In the predictive model, area under the receiver operating characteristic curve was 0.92 (95% CI 0.90 to 0.93). We estimate there were 8-17 total SARS-CoV-2 infections for each positive test (8-17:1). The ratio of estimated to confirmed cases was higher for ages 1-4 years (211-480:1), 5-9 years (80-185:1) and for females (13-25:1).
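Two of the quantities reported above can be illustrated with a small, self-contained sketch. The first function computes the area under the ROC curve as the probability that a randomly chosen confirmed/probable case receives a higher model score than a randomly chosen test-negative respondent, which is a standard equivalent definition. The second shows one plausible way an "infections per positive test" ratio could be formed from model scores on untested respondents. The threshold rule, the function names and the composition of the numerator are assumptions made for illustration; the abstract does not specify how the 8-17:1 range was derived.

```python
from typing import Sequence


def auc(pos_scores: Sequence[float], neg_scores: Sequence[float]) -> float:
    """AUC as the share of (positive, negative) pairs ranked correctly; ties count half."""
    wins = 0.0
    for p in pos_scores:
        for n in neg_scores:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos_scores) * len(neg_scores))


def infections_per_positive_test(
    n_positive_tests: int,
    n_confirmed_probable: int,
    untested_scores: Sequence[float],
    threshold: float,
) -> float:
    """Hypothetical ratio: (confirmed/probable + predicted infections) per positive test.

    Assumption: an untested respondent is counted as infected when the
    model score is at or above `threshold`.
    """
    predicted_untested = sum(1 for s in untested_scores if s >= threshold)
    return (n_confirmed_probable + predicted_untested) / n_positive_tests


# Toy inputs, invented for illustration only (not study data).
toy_auc = auc([0.9, 0.3], [0.5, 0.1])
toy_ratio = infections_per_positive_test(2, 3, [0.9, 0.7, 0.2, 0.4, 0.6], threshold=0.5)
```

Because the ratio depends directly on the chosen threshold, any estimate produced this way should be read alongside the model's discrimination and reported as a range rather than a single value.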
Conclusions: From March 2020 to August 2021, the majority of SARS-CoV-2 infections in Gilgit-Baltistan went unconfirmed, particularly among women and children. Predictive models which incorporate self-reported symptoms may improve understanding of the burden of disease in settings lacking diagnostic capacity.