Thymoma habitat segmentation and risk prediction model using CT imaging and K-means clustering.

Medical Physics. Pub Date: 2025-05-19. DOI: 10.1002/mp.17892
Zhu Liang, Jiamin Li, Shuyan He, Siyuan Li, Runzhi Cai, Chunyuan Chen, Yan Zhang, Biao Deng, Yanxia Wu

Abstract

Background: Thymomas are rare but range in clinical behavior from indolent to aggressive, so accurate risk stratification is crucial for treatment planning. Conventional histopathological and radiological assessments often fail to capture tumor heterogeneity, which can affect prognosis. Radiomics combined with machine learning extracts and analyzes quantitative imaging features and may improve tumor classification and risk prediction. Segmenting a tumor into distinct habitat zones allows intratumoral heterogeneity to be assessed more effectively. This study applies radiomics and machine learning to thymoma risk prediction, aiming to improve diagnostic consistency and reduce variability in radiologists' assessments.

Objective: This study aims to identify distinct habitat zones within thymomas through CT imaging feature analysis and to establish a predictive model that differentiates high-risk from low-risk thymomas. It also explores how this model can assist radiologists.

Methods: We obtained CT imaging data from 133 patients with thymoma who were treated at the Affiliated Hospital of Guangdong Medical University from 2015 to 2023. Images from the plain scan, venous, and arterial phases, together with their differential (subtracted) images, were used. Tumor regions were segmented into three habitat zones using K-Means clustering. Imaging features were extracted from each habitat zone using the PyRadiomics library (van Griethuysen, 2017). The 28 most discriminative features were selected through Mann-Whitney U tests (Mann, 1947) and Spearman's correlation analysis (Spearman, 1904). Five predictive models were built with the same machine learning algorithm, a Support Vector Machine (SVM): Habitat1, Habitat2, and Habitat3 (each trained on features from a single tumor habitat region), Habitat All (trained on the combined features from all regions), and Intra (trained on intratumoral features). Their performances were evaluated and compared. The models' diagnostic outcomes were also compared with the diagnoses of four radiologists (two junior and two experienced physicians).
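
The habitat segmentation step can be illustrated with a minimal sketch. The abstract does not state which voxel-level inputs were clustered or which K-Means implementation was used, so everything below is an assumption for illustration: each voxel is represented by a hypothetical (plain, arterial, venous) intensity triple, the toy values are invented, and a small pure-Python Lloyd's algorithm with farthest-point seeding stands in for whatever library the authors used.

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two feature tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def init_centers(points, k):
    """Farthest-point seeding: deterministic, spreads the initial centers out."""
    centers = [points[0]]
    while len(centers) < k:
        # Pick the point whose nearest existing center is farthest away.
        centers.append(max(points, key=lambda p: min(sq_dist(p, c) for c in centers)))
    return centers


def kmeans(points, k=3, max_iter=100):
    """Lloyd's algorithm; returns one cluster label per point and the centers."""
    centers = init_centers(points, k)
    labels = None
    for _ in range(max_iter):
        # Assignment step: each voxel goes to its nearest center.
        new_labels = [min(range(k), key=lambda j: sq_dist(p, centers[j])) for p in points]
        if new_labels == labels:  # assignments stopped changing
            break
        labels = new_labels
        # Update step: each center moves to the mean of its members.
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # keep the old center if a cluster is empty
                centers[j] = tuple(sum(col) / len(members) for col in zip(*members))
    return labels, centers


# Hypothetical voxels inside one tumor mask: (plain, arterial, venous) intensities.
voxels = [
    (20, 25, 28), (22, 27, 30), (18, 24, 26),        # weakly enhancing
    (40, 70, 80), (42, 74, 82), (38, 68, 78),        # moderately enhancing
    (45, 120, 110), (47, 125, 112), (44, 118, 108),  # strongly enhancing
]

labels, centers = kmeans(voxels, k=3)
print(labels)  # one habitat index per voxel
```

In this sketch each cluster label plays the role of a habitat zone; radiomic features would then be extracted separately from the voxels carrying each label.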

Results: The area under the curve (AUC) was 0.818 for habitat zone 1, 0.732 for habitat zone 2, and 0.763 for habitat zone 3. The comprehensive model, which combined data from all habitat zones, achieved an AUC of 0.960, outperforming the model based on traditional radiomic features (AUC of 0.720). The model significantly improved the diagnostic accuracy of all four radiologists. The AUCs for junior radiologists 1 and 2 increased from 0.747 and 0.775 to 0.932 and 0.972, respectively, and the AUCs for experienced radiologists 1 and 2 increased from 0.932 and 0.859 to 0.977 and 0.972, respectively.
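
For readers less familiar with the metric, the AUC can be read as the probability that a randomly chosen high-risk case receives a higher model score than a randomly chosen low-risk case. The sketch below computes it directly from that pairwise definition, counting ties as half. The scores and labels are invented toy values, not data from the study, and the function name `auc` is a hypothetical helper rather than anything the authors describe.

```python
def auc(scores, labels):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    labels use 1 for high-risk and 0 for low-risk; tied scores count as half.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one case in each class")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


# Toy example: three high-risk and three low-risk cases with made-up scores.
toy_scores = [0.9, 0.8, 0.4, 0.7, 0.3, 0.2]
toy_labels = [1, 1, 1, 0, 0, 0]
toy_auc = auc(toy_scores, toy_labels)
print(round(toy_auc, 3))  # 8 of the 9 pairs are ordered correctly
```

The pairwise count is the Mann-Whitney U statistic divided by the number of positive-negative pairs, which is why an AUC of 0.5 corresponds to chance-level ranking and 1.0 to perfect separation.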

Conclusion: This study successfully identified distinct habitat zones within thymomas through CT imaging feature analysis and developed an efficient predictive model that significantly improved diagnostic accuracy. This model offers a novel tool for risk assessment of thymomas and can aid in guiding clinical decision-making.
