Machine learning-based risk prediction model for pertussis in children: a multicenter retrospective study.

IF 3.4 | Zone 3 (Medicine) | JCR Q2 (Infectious Diseases)
Juan Xie, Run-Wei Ma, Yu-Jing Feng, Yuan Qiao, Hong-Yan Zhu, Xing-Ping Tao, Wen-Juan Chen, Cong-Yun Liu, Tan Li, Kai Liu, Li-Ming Cheng
BMC Infectious Diseases, 25(1): 428. Journal Article. Published 2025-03-27.
DOI: 10.1186/s12879-025-10797-7 (https://doi.org/10.1186/s12879-025-10797-7)
PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11951648/pdf/
Citations: 0

Abstract


Background: Pertussis is a highly contagious respiratory disease. Although vaccination has reduced its incidence, cases have resurged in certain regions owing to immune escape and waning vaccine efficacy. Prompt identification of high-risk patients is imperative to mitigate transmission and avert complications. However, current diagnostic methods, including PCR and bacterial culture, are time-consuming and expensive. Some studies have attempted to develop risk prediction models based on multivariate data, but their performance leaves room for improvement. This study therefore aims to further optimize and extend such risk assessment tools so that high-risk individuals can be identified more efficiently, compensating for the shortcomings of existing diagnostic methods.

Objective: This study aimed to develop a pertussis risk prediction model that is efficient and generalizes well across different datasets. The model was constructed with machine learning techniques from multicenter data, using a screened set of key features. Its performance and generalization ability were evaluated by deploying it on an online platform. The study also aims to provide a rapid and accurate auxiliary diagnostic tool for clinical practice that helps identify high-risk patients in a timely manner, optimize early intervention strategies, and reduce both the risk of complications and transmission, thereby improving the efficiency of public health management.

Methods: First, data from 1085 patients with suspected pertussis were collected from 7 centers, and ten key features were identified using LASSO regression and the Boruta algorithm: PDW-MPV-RATIO, SII, white blood cells, platelet distribution width, mean platelet volume, lymphocytes, cough duration, vaccination, fever, and lytic lymphocytes. Based on these features, eight models were then trained and validated to assess their performance, and their generalization ability was confirmed on external datasets. Finally, an online platform was built so that clinicians can use the models in real time.
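
Two of the ten features are composite indices derived from a routine blood count. The abstract does not define them, so the following is only a minimal sketch under stated assumptions: that SII denotes the systemic immune-inflammation index (platelets × neutrophils / lymphocytes), and that PDW-MPV-RATIO is platelet distribution width divided by mean platelet volume. The `BloodCount` class, its field names, and the example values are hypothetical and are not taken from the study.

```python
from dataclasses import dataclass


@dataclass
class BloodCount:
    """Hypothetical container for one patient's routine blood count."""
    platelets: float    # 10^9/L
    neutrophils: float  # 10^9/L
    lymphocytes: float  # 10^9/L
    pdw: float          # platelet distribution width
    mpv: float          # mean platelet volume


def sii(count: BloodCount) -> float:
    """Systemic immune-inflammation index: platelets * neutrophils / lymphocytes."""
    if count.lymphocytes <= 0:
        raise ValueError("lymphocyte count must be positive")
    return count.platelets * count.neutrophils / count.lymphocytes


def pdw_mpv_ratio(count: BloodCount) -> float:
    """PDW divided by MPV (an assumed reading of 'PDW-MPV-RATIO')."""
    if count.mpv <= 0:
        raise ValueError("MPV must be positive")
    return count.pdw / count.mpv


# Made-up values for illustration only; not data from the study.
example = BloodCount(platelets=300.0, neutrophils=2.0, lymphocytes=8.0,
                     pdw=12.0, mpv=10.0)
print(sii(example), pdw_mpv_ratio(example))
```

Both helpers reject non-positive denominators rather than returning infinity, so a data-entry error surfaces before the value reaches a model.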

Results: The random forest model showed excellent discrimination, with an AUC of 0.98 in the validation set and 0.97 in the external validation set. Calibration curve and decision curve analyses showed that the model was highly accurate in predicting low-to-medium risk patients, which could help clinicians avoid unnecessary interventions, especially in resource-limited settings. Applying the model can help optimize the early identification and management of high-risk patients and improve clinical decision-making.
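
The two kinds of evaluation named above can be illustrated with standard textbook definitions. This is a hedged sketch, not the authors' code: `auc` computes the area under the ROC curve as the probability that a randomly chosen positive case scores higher than a randomly chosen negative one, and `net_benefit` computes the quantity plotted in a decision curve at one threshold probability, TP/n − (FP/n) × pt/(1 − pt). The function names and the toy labels and scores are hypothetical.

```python
from typing import Sequence


def auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as P(score of a positive > score of a negative); ties count 0.5."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def net_benefit(labels: Sequence[int], scores: Sequence[float],
                threshold: float) -> float:
    """Net benefit of treating every case with score >= threshold."""
    if not 0.0 < threshold < 1.0:
        raise ValueError("threshold must lie strictly between 0 and 1")
    n = len(labels)
    tp = sum(1 for y, s in zip(labels, scores) if s >= threshold and y == 1)
    fp = sum(1 for y, s in zip(labels, scores) if s >= threshold and y == 0)
    # False positives are weighted by the odds of the threshold probability.
    return tp / n - (fp / n) * threshold / (1.0 - threshold)


# Toy data for illustration only; not results from the study.
toy_labels = [1, 1, 1, 0, 0, 0]
toy_scores = [0.9, 0.8, 0.4, 0.6, 0.3, 0.1]
print(auc(toy_labels, toy_scores))
print(net_benefit(toy_labels, toy_scores, threshold=0.5))
```

A decision curve is obtained by evaluating `net_benefit` over a range of thresholds and comparing the result with the "treat all" and "treat none" strategies. The pairwise AUC loop is quadratic in the number of cases, which is acceptable for a cohort of about a thousand patients but would normally be replaced by a rank-based computation for larger data.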

Conclusion: The pertussis prediction model devised in this study was validated using multicenter data, exhibited high prediction performance, and was successfully implemented online. Future research should broaden the data sources and incorporate dynamic data to enhance the model's accuracy and applicability.

Source journal
BMC Infectious Diseases (Medicine: Infectious Diseases)
CiteScore: 6.50
Self-citation rate: 0.00%
Articles published: 860
Review time: 3.3 months
About the journal: BMC Infectious Diseases is an open access, peer-reviewed journal that considers articles on all aspects of the prevention, diagnosis and management of infectious and sexually transmitted diseases in humans, as well as related molecular genetics, pathophysiology, and epidemiology.