Machine learning-assisted classification of lung cancer: the role of sarcopenia, inflammatory biomarkers, and PET/CT anatomical-metabolic parameters.

IF 2 | Zone 4 (Medicine) | JCR Q3: ENGINEERING, BIOMEDICAL

Physical and Engineering Sciences in Medicine Pub Date : 2025-10-06 DOI:10.1007/s13246-025-01650-x

Handan Tanyildizi-Kokkulunk, Goksel Alcin, Iffet Cavdar, Resit Akyel, Safak Yigit, Tuba Ciftci-Kusbeci, Gonul Caliskan


Abstract

Accurately differentiating non-cancerous, benign, and malignant lung findings remains a diagnostic challenge because their clinical and imaging characteristics overlap. This study proposes a multimodal machine learning (ML) framework that integrates positron emission tomography/computed tomography (PET/CT) anatomic-metabolic parameters, sarcopenia markers, and inflammatory biomarkers to improve classification performance in lung cancer. A retrospective dataset of 222 patients was analyzed, including demographic variables, functional and morphometric sarcopenia indices, hematological inflammation markers, and PET/CT-derived parameters such as maximum and mean standardized uptake value (SUVmax, SUVmean), metabolic tumor volume (MTV), and total lesion glycolysis (TLG). Five ML algorithms (Logistic Regression, Multi-Layer Perceptron, Support Vector Machine, Extreme Gradient Boosting, and Random Forest) were evaluated using standardized performance metrics. The Synthetic Minority Oversampling Technique (SMOTE) was applied to balance class distributions. Feature importance analysis was conducted using the best-performing model, and classification was repeated using the top 15 features. Among the models, Random Forest performed best, with a test accuracy of 96%; precision, recall, and F1-score of 0.96 each; and an average AUC of 0.99. Feature importance analysis identified SUVmax, SUVmean, TLG, and skeletal muscle index as the leading predictors. A secondary classification using only the top 15 features yielded a slightly higher test accuracy (97%). These findings underscore the potential of integrating metabolic imaging, physical function, and biochemical inflammation markers in a non-invasive ML-based diagnostic pipeline. The proposed framework demonstrates high accuracy and generalizability and may serve as an effective clinical decision support tool in early lung cancer diagnosis and risk stratification.
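The abstract names its methods but gives no implementation details. As an illustration only, the following pure-Python sketch shows two of the steps it mentions: SMOTE-style oversampling (synthesizing minority-class samples by interpolating between same-class neighbours) and the accuracy, precision, recall, and F1 metrics. The function names `smote_oversample` and `macro_scores`, the toy feature values, the choice `k=3`, and the use of macro-averaging are all assumptions made for this example, not details taken from the paper.

```python
import random
from collections import Counter


def smote_oversample(X, y, k=3, seed=0):
    """Return (X, y) with every class grown to the majority-class size.

    Each synthetic row is a random point on the line segment between a
    sample and one of its k nearest neighbours from the same class.
    """
    rng = random.Random(seed)
    counts = Counter(y)
    target = max(counts.values())
    X_out, y_out = [list(row) for row in X], list(y)
    for label, n in counts.items():
        members = [row for row, lab in zip(X, y) if lab == label]
        if len(members) < 2:
            continue  # interpolation needs at least two samples
        for _ in range(target - n):
            i = rng.randrange(len(members))
            base = members[i]
            others = members[:i] + members[i + 1:]
            # Squared Euclidean distance is enough for ranking neighbours.
            others.sort(key=lambda m: sum((a - b) ** 2 for a, b in zip(base, m)))
            neighbour = rng.choice(others[:k])
            gap = rng.random()
            X_out.append([a + gap * (b - a) for a, b in zip(base, neighbour)])
            y_out.append(label)
    return X_out, y_out


def macro_scores(y_true, y_pred):
    """Accuracy plus macro-averaged precision, recall, and F1."""
    labels = sorted(set(y_true) | set(y_pred))
    pairs = list(zip(y_true, y_pred))
    precisions, recalls, f1s = [], [], []
    for lab in labels:
        tp = sum(1 for t, p in pairs if t == lab and p == lab)
        fp = sum(1 for t, p in pairs if t != lab and p == lab)
        fn = sum(1 for t, p in pairs if t == lab and p != lab)
        prec = tp / (tp + fp) if tp + fp else 0.0
        rec = tp / (tp + fn) if tp + fn else 0.0
        f1 = 2 * prec * rec / (prec + rec) if prec + rec else 0.0
        precisions.append(prec)
        recalls.append(rec)
        f1s.append(f1)
    n = len(labels)
    return {
        "accuracy": sum(1 for t, p in pairs if t == p) / len(pairs),
        "precision": sum(precisions) / n,
        "recall": sum(recalls) / n,
        "f1": sum(f1s) / n,
    }


# Hypothetical toy rows: [SUVmax, skeletal muscle index].
# Labels 0 / 1 / 2 stand for non-cancerous / benign / malignant.
X = [[2.1, 40.0], [2.4, 38.5], [3.0, 41.0],
     [9.8, 52.0], [11.2, 49.5],
     [5.5, 45.0], [6.1, 44.0], [5.9, 46.5], [6.4, 43.0], [5.2, 47.0]]
y = [0, 0, 0, 1, 1, 2, 2, 2, 2, 2]

X_bal, y_bal = smote_oversample(X, y)
print(Counter(y_bal))
print(macro_scores([0, 0, 1, 1, 2, 2], [0, 1, 1, 1, 2, 0]))
```

In practice, oversampling of this kind is usually applied to the training split only, so that synthetic samples do not leak into the test set. The abstract does not state how the split and oversampling were ordered in this study.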


Source journal

Physical and Engineering Sciences in Medicine Multiple-

CiteScore: 8.40

Self-citation rate: 4.50%

Articles published: 110