Random forest-based predictor selection and pneumonia risk probability assessment in acute respiratory infections: A cross-sectional study in Chongqing, China, 2023–2024
Impact factor: 3.0; JCR Q1 (Public, Environmental & Occupational Health)
Yunshao Xu, Yuping Duan, Jule Yang, Mingyue Jiang, Yanxia Sun, Yanlin Cao, Li Qi, Zeni Wu, Luzhao Feng
Biosafety and Health, vol. 7, no. 4, pp. 238-244. Published 2025-08-01. DOI: 10.1016/j.bsheal.2025.07.004. https://www.sciencedirect.com/science/article/pii/S2590053625000977
Abstract
Progression of acute respiratory infection (ARI) to pneumonia increases severity and healthcare burden. Limited evidence exists on using machine learning to identify predictors from demographic, clinical, and pathogen detection data. This study aimed to identify predictors of pneumonia in patients with ARI using machine learning methods. This observational study was conducted in Chongqing, China, from September 2023 to April 2024. Outpatients and inpatients with ARI were recruited weekly. A random forest algorithm was used for predictor selection, followed by a logistic regression-based nomogram to estimate the probability of pneumonia. Among the 1,638 patients with ARI, those with pneumonia had higher rates of influenza A virus (IFV-A) (49.2 % vs. 39.6 %), influenza B virus (26.3 % vs. 18.6 %), and respiratory syncytial virus (6.1 % vs. 1.9 %) infection than those without pneumonia. In the subgroup of 79 patients with comprehensive blood tests, pneumonia was positively associated with hemoglobin (130.00 g/L vs. 124.00 g/L), blood urea nitrogen (5.73 mmol/L vs. 4.85 mmol/L), C-reactive protein (36.10 mg/L vs. 25.25 mg/L), procalcitonin (0.11 μg/L vs. 0.07 μg/L), and D-dimer (0.95 μg/L vs. 0.80 μg/L) levels, whereas it was inversely associated with neutrophils (4.20 × 10⁹/L vs. 4.76 × 10⁹/L), aspartate aminotransferase (22.50 U/L vs. 24.00 U/L), and uric acid (280.90 μmol/L vs. 330.00 μmol/L) levels. Elevated D-dimer levels (adjusted odds ratio [aOR] = 1.002, 95 % confidence interval [CI]: 1.001–1.004) and IFV-A infection (aOR = 9.308, 95 % CI: 2.433–35.606) were significantly associated with increased pneumonia probability. In future clinical practice, particular attention should be given to patients with ARI who have elevated D-dimer levels or IFV-A infection.
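The reported odds ratios can be read more easily with a little arithmetic. The following Python sketch is an illustration only, not the authors' code: it converts the two aORs from the abstract into logistic regression coefficients, recovers an approximate standard error from each 95 % CI, and shows how a per-unit odds ratio compounds over a larger change. It assumes that the D-dimer aOR is expressed per one unit of the measured scale (the abstract does not say), that the CIs are Wald-type intervals, and that the effect is log-linear. The function names, the 500-unit increase, and the 10 % baseline probability are hypothetical values chosen for illustration.

```python
import math

# Adjusted odds ratios with 95 % CI bounds, copied from the abstract.
# Tuple layout: (point estimate, lower bound, upper bound).
AOR_D_DIMER = (1.002, 1.001, 1.004)   # assumed to be per one unit of D-dimer
AOR_IFV_A = (9.308, 2.433, 35.606)    # IFV-A positive vs. negative


def log_odds_coefficient(odds_ratio):
    """Return the logistic regression coefficient implied by an odds ratio."""
    return math.log(odds_ratio)


def approx_standard_error(lower, upper, z=1.96):
    """Approximate the coefficient's SE, assuming a Wald-type 95 % CI."""
    return (math.log(upper) - math.log(lower)) / (2 * z)


def odds_ratio_for_increase(odds_ratio_per_unit, units):
    """Odds ratio for an increase of `units`, assuming a log-linear effect."""
    return odds_ratio_per_unit ** units


def shifted_probability(baseline_probability, odds_ratio):
    """Apply an odds ratio to a baseline probability; return the new probability."""
    odds = baseline_probability / (1 - baseline_probability) * odds_ratio
    return odds / (1 + odds)


if __name__ == "__main__":
    for name, (aor, lower, upper) in [("D-dimer", AOR_D_DIMER),
                                      ("IFV-A", AOR_IFV_A)]:
        beta = log_odds_coefficient(aor)
        se = approx_standard_error(lower, upper)
        print(f"{name}: beta = {beta:.4f}, approx. SE = {se:.4f}")

    # Hypothetical scenario: a 500-unit rise in D-dimer and a 10 % baseline risk.
    combined = odds_ratio_for_increase(AOR_D_DIMER[0], 500)
    print(f"OR for a 500-unit D-dimer increase: {combined:.2f}")
    print(f"Probability after shift: {shifted_probability(0.10, combined):.3f}")
```

Under these assumptions, an aOR of 1.002 per unit is not a negligible effect: compounded over a 500-unit increase it corresponds to an odds ratio of about 2.7. The wide CI for IFV-A (2.433–35.606) likewise translates into a large standard error on the log-odds scale, which is consistent with the small blood-test subgroup of 79 patients.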