Machine learning assisted breathomic approach for early-stage thoracic cancer detection.
Zhenguang Chen, Minhua Peng, Pengnan Fan, Sai Chen, Xinxin Cheng, Bo Xu, Ruiping Chen, Xiao Hu, Wei Wei, Tingting Zhao, Jun Kong, Weiliang Liang, Xiangcheng Qiu, Sitong Chen, Junqi Wang
Frontiers in Oncology, vol. 15, article 1635280. Published 2025-09-17. DOI: 10.3389/fonc.2025.1635280 (https://doi.org/10.3389/fonc.2025.1635280). PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12483886/pdf/
Abstract
Objective: This study explores the feasibility of using breathomic biomarkers analyzed by machine learning as a non-invasive diagnostic tool to differentiate between benign and malignant thoracic lesions, aiming to enhance early detection of thoracic cancers and inform clinical decision-making.
Methods: This study enrolled 132 participants with a confirmed diagnosis of lung cancer, esophageal cancer, thymoma, or benign disease. Exhaled breath samples were analyzed by thermal desorption-gas chromatography-mass spectrometry. A logistic regression algorithm was used to construct a classification model for benign and malignant thoracic lesions. The model was trained on a subset of 80 cases and then validated on a separate set of 52 samples.
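As a rough illustration of the train-then-validate workflow described in the Methods, the following minimal sketch fits a logistic regression on 80 cases and scores 52 held-out cases. Everything apart from the 80/52 sizes and the model family is an assumption: the data are synthetic random numbers (not breath measurements), the 13 features simply mirror the size of the VOC panel, the split is a plain slice, the 0.5 decision threshold is arbitrary, and the gradient-descent fitting routine is a generic stand-in rather than the authors' software.

```python
import math
import random


def sigmoid(z):
    # Numerically safe logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def fit_logistic(X, y, lr=0.1, epochs=500):
    """Fit weights and bias by batch gradient descent on the log-loss."""
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * d, 0.0
        for xi, yi in zip(X, y):
            err = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) - yi
            for j in range(d):
                grad_w[j] += err * xi[j]
            grad_b += err
        w = [wj - lr * g / n for wj, g in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b


def predict_proba(w, b, X):
    """Return the predicted probability of the positive (malignant) class."""
    return [sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) for xi in X]


# Synthetic stand-in data (assumption): 132 samples, 13 features.
# Label 1 plays the role of "malignant", label 0 of "benign".
rng = random.Random(0)
N_SAMPLES, N_FEATURES = 132, 13
y = [1 if rng.random() < 0.7 else 0 for _ in range(N_SAMPLES)]
X = [[rng.gauss(0.8 * label, 1.0) for _ in range(N_FEATURES)] for label in y]

# 80 training cases and 52 validation cases, as in the Methods.
# Slicing in index order is an assumption; the source does not say how
# the split was made.
X_train, y_train = X[:80], y[:80]
X_valid, y_valid = X[80:], y[80:]

w, b = fit_logistic(X_train, y_train)
risk = predict_proba(w, b, X_valid)

# Arbitrary 0.5 cut-off (assumption) to turn risk scores into class calls.
predicted = [1 if p >= 0.5 else 0 for p in risk]
accuracy = sum(p == t for p, t in zip(predicted, y_valid)) / len(y_valid)
print(f"validation accuracy on synthetic data: {accuracy:.2f}")
```

The point of the sketch is only the separation of roles: the model parameters are fitted on the 80 training cases, and the 52 validation cases are used solely to evaluate the fitted model.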
Results: A logistic regression model based on thirteen exhaled volatile organic compounds (VOCs) was developed to differentiate benign from malignant thoracic lesions. The 13-VOC model achieved an AUC of 0.85 (0.72, 0.96), accuracy of 0.79 (0.66, 0.88), sensitivity of 0.82 (0.67, 0.91), and specificity of 0.71 (0.45, 0.88). It correctly classified 80% of lung cancer, 80% of thymoma, and 100% of esophageal cancer cases, and correctly identified 71.4% of benign lesions. For lung cancer, the model achieved an AUC of 0.79 (0.57, 0.98), sensitivity of 0.80 (0.63, 0.91), and specificity of 0.63 (0.31, 0.86), with 81.8% accuracy in detecting early-stage (Stage 0 + I + II) disease. The model outperformed a 4-serum tumor marker panel in sensitivity (0.90 vs. 0.39, p < 0.001). Additionally, in a cohort of 58 cancer patients, model-predicted risk decreased significantly after surgery (p < 0.01), suggesting that the predicted risk tracks the reduction in disease burden.
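To make the reported sensitivity, specificity, and accuracy concrete, the sketch below recomputes them, with Wilson score intervals, from a confusion matrix. The counts are hypothetical: the abstract does not give them, and they were back-calculated here as one set of integers consistent with the 52 validation samples and the reported rates (31 of 38 malignant and 10 of 14 benign lesions classified correctly). The choice of the Wilson interval with z = 1.96 is likewise an assumption, since the abstract does not name the interval method.

```python
import math


def wilson_interval(successes, total, z=1.96):
    """Wilson score interval for a binomial proportion (about 95% for z=1.96)."""
    p = successes / total
    denom = 1.0 + z * z / total
    center = (p + z * z / (2 * total)) / denom
    half = z * math.sqrt(p * (1 - p) / total + z * z / (4 * total * total)) / denom
    return center - half, center + half


# Hypothetical confusion-matrix counts (assumption, not stated in the source),
# chosen to be consistent with 52 validation samples and the reported rates.
TP, FN = 31, 7   # malignant lesions: correctly flagged / missed
TN, FP = 10, 4   # benign lesions: correctly cleared / falsely flagged

metrics = {
    "sensitivity": (TP, TP + FN),                 # TP / all malignant
    "specificity": (TN, TN + FP),                 # TN / all benign
    "accuracy": (TP + TN, TP + FN + TN + FP),     # correct / all samples
}

for name, (k, n) in metrics.items():
    low, high = wilson_interval(k, n)
    print(f"{name}: {k / n:.2f} ({low:.2f}, {high:.2f})")
```

Under these assumed counts, the printed point estimates and intervals round to the figures given for the 13-VOC model: 0.82 (0.67, 0.91), 0.71 (0.45, 0.88), and 0.79 (0.66, 0.88). That agreement is suggestive only; it does not confirm the actual counts or the interval method the authors used. The width of the specificity interval shows how little can be concluded from a benign group this small.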
Conclusion: This study demonstrates the feasibility of using breathomic biomarkers to develop a non-invasive machine learning model for the early diagnosis of thoracic malignancies. These findings provide a foundation for breath analysis as a promising tool for early cancer detection, potentially improving clinical decision-making and patient outcomes.
About the journal:
Cancer Imaging and Diagnosis is dedicated to the publication of results from clinical and research studies applied to cancer diagnosis and treatment. The section aims to publish studies from the entire field of cancer imaging: results from routine use of clinical imaging in both radiology and nuclear medicine, results from clinical trials, experimental molecular imaging in humans and small animals, research on new contrast agents in CT, MRI, ultrasound, publication of new technical applications and processing algorithms to improve the standardization of quantitative imaging and image guided interventions for the diagnosis and treatment of cancer.