Hao Chen, Fanxuan Chen, Yijun Wang, Enna Cai, Wangzheng Pan, Yichen Li, Zefei Mo, Hao Lou, Chufan Ren, Chenyue Dai, Xingbo Shan, Hui Ye, Zhenwei Xu, Pu Dong, Han Zhou, Shuya Xu, Tianye Zhu, Mingzhi Su, Xingguo Miao, Xiaoqu Hu, Liang Hong, Yi Wang, Feifei Su
Journal of Cellular and Molecular Medicine, volume 29, issue 6. Journal article, published 2025-03-23. Impact factor: 5.3.
DOI: 10.1111/jcmm.70497
https://onlinelibrary.wiley.com/doi/10.1111/jcmm.70497
A Machine Learning Model for Diagnosing Opportunistic Infections in HIV Patients: Broad Applicability Across Infection Types
Opportunistic infections (OIs) are the leading cause of hospitalisation and mortality among patients infected with human immunodeficiency virus (HIV). The diversity of pathogens and the complexity of the associated clinical manifestations make timely diagnosis of these infections difficult. This study uses machine learning to develop a diagnostic model that quickly identifies whether an HIV-infected patient has any type of OI, rather than one specific infection, so that it can be applied across clinical scenarios. In this retrospective cohort study, clinical data were collected from HIV-infected patients at four healthcare organisations in China. Twelve machine learning classification algorithms were trained and evaluated. Features were then ranked by importance and reduced to eliminate redundant ones. Two five-feature sets, one selected by Shapley additive explanations (procalcitonin, haemoglobin, lymphocyte, creatinine, platelet) and one selected by permutation importance (procalcitonin, lymphocyte, haemoglobin, creatinine, indirect bilirubin), both achieved their highest F1 score with the adaptive boosting classifier. Their test-set F1 scores were 0.9016 and 0.9063, respectively, significantly outperforming the best 32-feature model, a gradient boosting classifier with a test-set F1 score of 0.8991.
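The feature-reduction step described in the abstract can be illustrated with a minimal sketch. It is not the authors' pipeline: the data are synthetic (generated with scikit-learn's `make_classification`), the sample size, split ratio, and hyperparameters are arbitrary assumptions, and only the permutation-importance ranking is shown (the Shapley-based ranking would need an additional package). The sketch fits an adaptive boosting classifier on 32 features, ranks them by permutation importance, refits on the top five, and reports the test-set F1 score of both models.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import AdaBoostClassifier
from sklearn.inspection import permutation_importance
from sklearn.metrics import f1_score
from sklearn.model_selection import train_test_split

# Synthetic stand-in for the clinical data (assumption): 32 numeric features
# and a binary label playing the role of "has an opportunistic infection".
X, y = make_classification(
    n_samples=600, n_features=32, n_informative=5, n_redundant=0, random_state=0
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

# Baseline: adaptive boosting on all 32 features.
full_model = AdaBoostClassifier(random_state=0).fit(X_train, y_train)
full_f1 = f1_score(y_test, full_model.predict(X_test))

# Rank features by the mean drop in F1 when each one is shuffled.
# The ranking uses training data only, so the test set stays untouched.
ranking = permutation_importance(
    full_model, X_train, y_train, scoring="f1", n_repeats=10, random_state=0
)
top5 = ranking.importances_mean.argsort()[::-1][:5]

# Refit on the five highest-ranked features and score on the held-out set.
reduced_model = AdaBoostClassifier(random_state=0).fit(X_train[:, top5], y_train)
reduced_f1 = f1_score(y_test, reduced_model.predict(X_test[:, top5]))

print("top-5 feature indices:", sorted(top5.tolist()))
print(f"F1 with 32 features: {full_f1:.4f}")
print(f"F1 with 5 features:  {reduced_f1:.4f}")
```

On synthetic data the two scores say nothing about the clinical result; the point is only the shape of the comparison, in which a reduced model is scored on the same held-out set as the full model.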
About the journal:
The Journal of Cellular and Molecular Medicine serves as a bridge between physiology and cellular medicine, as well as molecular biology and molecular therapeutics. With a 20-year history, the journal adopts an interdisciplinary approach to showcase innovative discoveries.
It publishes research aimed at advancing the collective understanding of the cellular and molecular mechanisms underlying diseases. The journal emphasizes translational studies that turn this knowledge into therapeutic strategies. Being fully open access, the journal is accessible to all readers.