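Estimation of unconfirmed COVID-19 cases from a cross-sectional survey of >10 000 households and a symptom-based machine learning model in Gilgit-Baltistan, Pakistan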

BMJ public health Pub Date : 2025-04-28 eCollection Date: 2025-01-01 DOI:10.1136/bmjph-2024-001255

Daniel S Farrar, Lisa G Pell, Yasin Muhammad, Sher Hafiz Khan, Lauren Erdman, Diego G Bassani, Zachary Tanner, Imran Ahmed Chauhadry, Muhammad Karim, Falak Madhani, Shariq Paracha, Masood Ali Khan, Sajid Soofi, Monica Taljaard, Rachel F Spitzer, Sarah M Abu Fadaleh, Zulfiqar A Bhutta, Shaun K Morris



Abstract


Estimation of unconfirmed COVID-19 cases from a cross-sectional survey of >10 000 households and a symptom-based machine learning model in Gilgit-Baltistan, Pakistan.

Introduction: Robust estimates of COVID-19 prevalence in settings with limited capacity for SARS-CoV-2 molecular and serologic testing are scarce. We aimed to describe the epidemiology of confirmed and probable COVID-19 in Gilgit-Baltistan, and to develop a symptom-based predictive model to identify infected but undiagnosed individuals with COVID-19.

Methods: We conducted a cross-sectional survey in 10 257 randomly selected households in Gilgit-Baltistan from June to August 2021. Data regarding SARS-CoV-2 testing, healthcare worker (HCW) diagnoses, symptoms and outcomes since March 2020 were self-reported by households. 'Confirmed/probable' infection was defined as a positive test, HCW COVID-19 diagnosis or HCW pneumonia diagnosis with COVID-19-positive contact. Robust Poisson regression was conducted to assess differences in symptoms, outcomes and SARS-CoV-2 testing rates. We developed a symptom-based machine learning model to differentiate confirmed/probable infections from those with negative tests. We applied this model to untested respondents to estimate the total prevalence of SARS-CoV-2 infection.
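The final step of the Methods, applying a fitted model to untested respondents to estimate total infections, can be illustrated with a minimal sketch. The abstract does not name the algorithm, the features, or the aggregation rule, so everything here is an assumption: the function names `estimate_total_infections` and `ratio_to_positive_tests` are hypothetical, the predicted probabilities are made-up toy values, and both aggregation options (summing probabilities, or counting respondents above a cut-off) are shown only as plausible choices, not as the authors' method.

```python
from typing import Optional, Sequence


def estimate_total_infections(
    confirmed: int,
    untested_probs: Sequence[float],
    threshold: Optional[float] = None,
) -> float:
    """Confirmed cases plus model-estimated infections among the untested.

    With threshold=None the predicted probabilities are summed, giving an
    expected count. Otherwise respondents whose predicted probability is at
    or above the threshold are counted as infected.
    """
    if threshold is None:
        predicted = sum(untested_probs)
    else:
        predicted = sum(1 for p in untested_probs if p >= threshold)
    return confirmed + predicted


def ratio_to_positive_tests(total_infections: float, positive_tests: int) -> float:
    """Estimated total infections per positive test."""
    if positive_tests <= 0:
        raise ValueError("positive_tests must be a positive integer")
    return total_infections / positive_tests


# Toy inputs (hypothetical, not study data): 2 confirmed cases and
# predicted infection probabilities for 6 untested respondents.
toy_probs = [0.9, 0.6, 0.5, 0.2, 0.1, 0.1]
expected_total = estimate_total_infections(2, toy_probs)
thresholded_total = estimate_total_infections(2, toy_probs, threshold=0.5)
```

The two aggregation rules generally give different totals, which is one way an estimate can end up reported as a range rather than a single number; whether the study's ranges arise this way is not stated in the abstract.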

Results: Data were collected for 77 924 people. Overall, 314 (0.5%) had confirmed/probable infections, 3263 (4.4%) had negative tests and 74 347 (95.1%) were untested. Children were tested less often than adults (adjusted prevalence ratio (aPR) 0.08, 95% CI 0.06 to 0.12 for ages 1-4 years vs 30-39 years), while males were tested more often than females (aPR 1.51, 95% CI 1.40 to 1.63). In the predictive model, area under the receiver operating characteristic curve was 0.92 (95% CI 0.90 to 0.93). We estimate there were 8-17 total SARS-CoV-2 infections for each positive test (8-17:1). The ratio of estimated to confirmed cases was higher for ages 1-4 years (211-480:1), 5-9 years (80-185:1) and for females (13-25:1).
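As a reading aid, the three group counts above can be checked against the stated total. They sum to 77 924, but their crude (unweighted) shares differ slightly from the percentages quoted. The abstract does not say how the percentages were calculated; survey weighting is one possible explanation, though that is an assumption rather than something the abstract confirms. The short calculation below uses only the counts reported in the Results.

```python
# Counts as reported in the Results paragraph.
counts = {
    "confirmed/probable": 314,
    "negative test": 3263,
    "untested": 74347,
}
total = sum(counts.values())

# Crude share of each group, as a percentage rounded to one decimal place.
crude_pct = {group: round(100 * n / total, 1) for group, n in counts.items()}

for group, pct in crude_pct.items():
    print(f"{group}: {pct}%")
```

The crude shares come out at 0.4%, 4.2% and 95.4%, compared with the reported 0.5%, 4.4% and 95.1%.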

Conclusions: From March 2020 to August 2021, the majority of SARS-CoV-2 infections in Gilgit-Baltistan went unconfirmed, particularly among women and children. Predictive models which incorporate self-reported symptoms may improve understanding of the burden of disease in settings lacking diagnostic capacity.
