Integrating CT radiomics and clinical features using machine learning to predict post-COVID pulmonary fibrosis.
Authors: Qianqian Zhao, Yijie Li, Chunliu Zhao, Ran Dong, Jiaxin Tian, Ze Zhang, Lin Huang, Jingwen Huang, Junhai Yan, Zhitao Yang, Jiangnan Ruan, Ping Wang, Li Yu, Jieming Qu, Min Zhou
Journal: Respiratory Research, 26(1):227, published 2025-07-02 (impact factor 5.8; JCR Q1, Medicine)
DOI: 10.1186/s12931-025-03305-7 (https://doi.org/10.1186/s12931-025-03305-7)
PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12225148/pdf/
Citation count: 0
Abstract
Background: The lack of reliable biomarkers for the early detection and risk stratification of post-COVID-19 pulmonary fibrosis (PCPF) underscores the urgent need for advanced predictive tools. This study aimed to develop a machine learning-based predictive model integrating quantitative CT (qCT) radiomics and clinical features to assess the risk of lung fibrosis in COVID-19 patients.
Methods: A total of 204 patients with confirmed COVID-19 pneumonia were included in the study. Of these, 93 patients were assigned to the development cohort (74 for training and 19 for internal validation), while 111 patients from three independent hospitals constituted the external validation cohort. Chest CT images were analyzed using qCT software. Clinical data and laboratory parameters were obtained from electronic health records. Least absolute shrinkage and selection operator (LASSO) regression with 5-fold cross-validation was used to select the most predictive features. Twelve machine learning algorithms were trained independently, and their performance was evaluated using receiver operating characteristic (ROC) curves, area under the curve (AUC) values, sensitivity, and specificity.
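The evaluation metrics named in the Methods can be illustrated with a minimal sketch. It is not the authors' code: the function names, the toy labels and scores, and the 0.5 decision threshold are all assumptions made for illustration. AUC is computed here in its rank-based form, as the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative case.

```python
from typing import Sequence, Tuple


def roc_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as P(score of a positive > score of a negative); ties count 0.5."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def sensitivity_specificity(
    labels: Sequence[int], scores: Sequence[float], threshold: float
) -> Tuple[float, float]:
    """Sensitivity and specificity when score >= threshold is called positive."""
    pairs = list(zip(labels, scores))
    tp = sum(1 for y, s in pairs if y == 1 and s >= threshold)
    fn = sum(1 for y, s in pairs if y == 1 and s < threshold)
    tn = sum(1 for y, s in pairs if y == 0 and s < threshold)
    fp = sum(1 for y, s in pairs if y == 0 and s >= threshold)
    return tp / (tp + fn), tn / (tn + fp)


# Hypothetical toy data (1 = fibrosis, 0 = no fibrosis); not from the study.
toy_labels = [1, 1, 1, 1, 0, 0, 0, 0]
toy_scores = [0.9, 0.8, 0.6, 0.3, 0.7, 0.4, 0.2, 0.1]

toy_auc = roc_auc(toy_labels, toy_scores)
toy_sens, toy_spec = sensitivity_specificity(toy_labels, toy_scores, threshold=0.5)
print(f"AUC={toy_auc:.4f} sensitivity={toy_sens:.2f} specificity={toy_spec:.2f}")
```

On the toy data, 13 of the 16 positive-negative pairs are ranked correctly, so the AUC is 0.8125, and both sensitivity and specificity are 0.75 at the assumed threshold. The pairwise loop is quadratic in cohort size, which is acceptable for cohorts of a few hundred patients.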
Results: Seventy-eight features were extracted and reduced to ten features for model development. These included two qCT radiomics signatures: (1) whole lung_reticulation (%) interstitial lung disease (ILD) texture analysis, and (2) interstitial lung abnormality (ILA)_Num of lung zones ≥ 5%_whole lung_ILA. Among the 12 machine learning algorithms evaluated, the support vector machine (SVM) model demonstrated the best predictive performance, with AUCs of 0.836 (95% CI: 0.830-0.842) in the training cohort, 0.796 (95% CI: 0.777-0.816) in the internal validation cohort, and 0.797 (95% CI: 0.691-0.873) in the external validation cohort.
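The abstract reports 95% confidence intervals for each AUC but does not say how they were obtained. One common approach, shown here purely as an assumption, is a percentile bootstrap over cases. The function names, the resample count, the seed, and the toy data below are all hypothetical and are not taken from the study.

```python
import random
from typing import List, Sequence, Tuple


def roc_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Rank-based AUC: P(positive score > negative score), ties count 0.5."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def bootstrap_auc_ci(
    labels: Sequence[int],
    scores: Sequence[float],
    n_resamples: int = 2000,
    alpha: float = 0.05,
    seed: int = 0,
) -> Tuple[float, float]:
    """Percentile bootstrap CI for AUC (an assumed method, not the paper's)."""
    if len(set(labels)) < 2:
        raise ValueError("both classes must be present")
    rng = random.Random(seed)
    n = len(labels)
    aucs: List[float] = []
    while len(aucs) < n_resamples:
        # Resample cases with replacement; skip draws that lack one class.
        idx = [rng.randrange(n) for _ in range(n)]
        ys = [labels[i] for i in idx]
        if len(set(ys)) < 2:
            continue
        aucs.append(roc_auc(ys, [scores[i] for i in idx]))
    aucs.sort()
    lower = aucs[int((alpha / 2) * n_resamples)]
    upper = aucs[min(n_resamples - 1, int((1 - alpha / 2) * n_resamples))]
    return lower, upper


# Hypothetical toy data (1 = fibrosis, 0 = no fibrosis); not from the study.
toy_labels = [1, 1, 1, 1, 0, 0, 0, 0]
toy_scores = [0.9, 0.8, 0.6, 0.3, 0.7, 0.4, 0.2, 0.1]

ci_low, ci_high = bootstrap_auc_ci(toy_labels, toy_scores)
print(f"AUC={roc_auc(toy_labels, toy_scores):.4f}, 95% CI: {ci_low:.3f}-{ci_high:.3f}")
```

With only eight toy cases the resulting interval is very wide. The width of a bootstrap interval depends on cohort size and on what is resampled, so intervals produced this way are not directly comparable with the values reported above.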
Conclusions: Integrating CT radiomics with clinical and laboratory variables using machine learning provides a robust tool for predicting pulmonary fibrosis progression in COVID-19 patients, facilitating early risk assessment and intervention.
About the journal:
Respiratory Research publishes high-quality clinical and basic research, review and commentary articles on all aspects of respiratory medicine and related diseases.
As the leading fully open access journal in the field, Respiratory Research provides an essential resource for pulmonologists, allergists, immunologists and other physicians, researchers, healthcare workers and medical students. Articles are disseminated worldwide, resulting in high visibility and generating international discussion.
Topics of specific interest include asthma, chronic obstructive pulmonary disease, cystic fibrosis, genetics, infectious diseases, interstitial lung diseases, lung development, lung tumors, occupational and environmental factors, pulmonary circulation, pulmonary pharmacology and therapeutics, respiratory immunology, respiratory physiology, and sleep-related respiratory problems.