A retrospective study differentiating nontuberculous mycobacterial pulmonary disease from pulmonary tuberculosis on computed tomography using radiomics and machine learning algorithms.

IF 4.3 | Zone 2 (Medicine) | Q1 MEDICINE, GENERAL & INTERNAL

Annals of medicine Pub Date : 2024-09-16 DOI:10.1080/07853890.2024.2401613

Lihong Zhou, Yiwen Wang, Wenchao Zhu, Yafang Zhao, Yihang Yu, Qin Hu, Wenke Yu


Citations: 0

Abstract

OBJECTIVE
To evaluate the effectiveness of machine learning models based on computed tomography (CT) radiomics in distinguishing nontuberculous mycobacterial pulmonary disease (NTM-PD) from pulmonary tuberculosis (PTB).

METHODS
In this retrospective analysis, the medical records of 99 individuals with NTM-PD and 285 individuals with PTB at Zhejiang Chinese and Western Medicine Integrated Hospital were examined. Computer-generated random numbers were used to divide the study cohort, with 80% assigned to the training cohort and 20% to the validation cohort. A total of 2153 radiomics features were extracted using Python (Pyradiomics package) to analyse the CT characteristics of the large disease areas. Significant features were identified using least absolute shrinkage and selection operator (LASSO) regression. Four supervised learning classifiers were developed: random forest (RF), support vector machine (SVM), logistic regression (LR), and extreme gradient boosting (XGBoost). Receiver-operating characteristic (ROC) curves and the areas under the ROC curves (AUCs) were used to assess and compare the predictive performance of these models.

RESULTS
The Student's t-test, Levene test, and LASSO algorithm together selected 23 optimal features. In the training cohort, the AUC values of the XGBoost, LR, SVM, and RF models were 1, 0.9044, 0.8868, and 0.7982, respectively. In the validation cohort, the AUC values of the XGBoost, LR, SVM, and RF models were 0.8358, 0.8085, 0.87739, and 0.7759, respectively. The DeLong test showed no significant differences among the models.

CONCLUSION
CT radiomics features can help distinguish NTM-PD from PTB. Among the four classifiers, SVM showed stable performance in identifying these two diseases.
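The 80%/20% cohort split and the AUC metric described in the abstract can be illustrated with a minimal sketch in standard-library Python. This is not the authors' code: the function names `stratified_split` and `roc_auc`, the label coding (1 = NTM-PD, 0 = PTB), the seed, and the assumption that the split preserved class proportions are all illustrative choices that the abstract does not specify. The AUC is computed from its pairwise definition, the probability that a randomly chosen positive case scores higher than a randomly chosen negative case.

```python
import random


def stratified_split(labels, train_frac=0.8, seed=0):
    """Split indices into train/validation sets, class by class.

    Assumption: the split preserves class proportions; the abstract only
    says computer-generated random numbers were used.
    """
    rng = random.Random(seed)
    by_class = {}
    for i, y in enumerate(labels):
        by_class.setdefault(y, []).append(i)

    train_idx, val_idx = [], []
    for y in sorted(by_class):
        idx = by_class[y][:]
        rng.shuffle(idx)
        n_train = round(train_frac * len(idx))
        train_idx.extend(idx[:n_train])
        val_idx.extend(idx[n_train:])
    return sorted(train_idx), sorted(val_idx)


def roc_auc(y_true, scores):
    """AUC as P(score of a positive > score of a negative); ties count 1/2."""
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Cohort sizes from the abstract; the 1/0 coding is a hypothetical choice.
labels = [1] * 99 + [0] * 285
train_idx, val_idx = stratified_split(labels)
print(len(train_idx), len(val_idx))  # 307 77

# Toy scores (not study data): 3 of the 4 positive/negative pairs are ordered correctly.
print(roc_auc([1, 1, 0, 0], [0.9, 0.4, 0.5, 0.1]))  # 0.75
```

Under these assumptions the split yields 79 NTM-PD and 228 PTB cases for training and 20 NTM-PD and 57 PTB cases for validation. With only 77 validation cases, AUC estimates carry wide uncertainty, which is consistent with the DeLong test finding no significant differences among the models.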


Source journal

Annals of medicine (Medicine: General & Internal Medicine)

CiteScore: 4.90

Self-citation rate: 0.00%

Articles published: 292

Review time: 3 months

About the journal: Annals of Medicine is one of the world’s leading general medical review journals, boasting an impact factor of 5.435. It presents high-quality topical review articles, commissioned by the Editors and Editorial Committee, as well as original articles. The journal provides the current opinion on recent developments across the major medical specialties, with a particular focus on internal medicine. The peer-reviewed content of the journal keeps readers updated on the latest advances in the understanding of the pathogenesis of diseases, and in how molecular medicine and genetics can be applied in daily clinical practice.