Development of a prognostic model for diagnosis of prostate cancer based on radiomics of biparametric magnetic resonance imaging apparent diffusion coefficient maps and stacking of machine learning algorithms

A. I. Kuznetsov
Journal: Digital Diagnostics. Publication date: 2024-07-03. DOI: https://doi.org/10.17816/dd626145

Abstract

BACKGROUND: Prostate cancer is one of the most common cancers among men [1, 2]. In recent years, a number of prognostic models based on texture analysis of biparametric magnetic resonance images have been created. Research has shown that radiomics features extracted from apparent diffusion coefficient maps are the most reproducible [3]. However, these models are limited in accuracy because each is built with a single machine learning algorithm that takes into account only linear dependencies [4–6].

AIM: To increase the accuracy of a prognostic model for diagnosing prostate cancer by stacking machine learning algorithms that take into account both linear and nonlinear dependencies, based on radiomics of biparametric magnetic resonance imaging apparent diffusion coefficient maps.

MATERIALS AND METHODS: A single-center retrospective cohort study of patients with suspected prostate cancer was conducted in the X-ray Diagnostics and Tomography Department of the United Hospital and Polyclinic (Moscow, Russia) from 2017 to 2023. The presence of prostate cancer was confirmed by biopsy or radical prostatectomy. Statistical analysis was performed using Python 3.11.

RESULTS: The study involved 67 men aged 60 [54; 66] years, of whom 57 were diagnosed with prostate cancer and 10 with benign prostate formations. The LIFEx software identified 96 radiomic features.

Statistically significant differences were found for: PARAMS_ZSpatialResampling (the voxel size of the image: Z dimension) (p=0.001), SHAPE_Sphericity[onlyFor3DROI] (how spherical a Volume of Interest is) (p=0.006), SHAPE_Compacity[onlyFor3DROI] (how compact the Volume of Interest is) (p=0.004), GLRLM_HGRE (p=0.039), GLRLM_SRHGE (p=0.041), GLRLM_RLNU (p=0.039), where GLRLM stands for Grey-Level Run Length Matrix. Univariate logistic regression showed that SHAPE_Compacity[onlyFor3DROI] (R²=15%) and PARAMS_ZSpatialResampling (R²=18%) had a statistically significant effect on the outcome.
First, a prognostic model that takes into account only linear dependencies was built using multivariate logistic regression. The model includes 3 features that together have a statistically significant effect on the outcome (R²=23%): SHAPE_Sphericity[onlyFor3DROI], PARAMS_ZSpatialResampling and GLRLM_RLNU.

To describe nonlinear relationships, another model was built with the “Decision Tree” algorithm. It included 4 indicators (R²=58%): DISCRETIZED_HISTO_Entropy_log10 (the randomness of the distribution), SHAPE_Sphericity[onlyFor3DROI], PARAMS_ZSpatialResampling and GLRLM_SRE.

Stacking of the algorithms, which consists of calculating the arithmetic mean of the predictions of the multivariate logistic regression and “Decision Tree” models, made it possible to construct a model that takes into account both linear and nonlinear dependencies. The model includes 5 features (R²=77%). The constructed model formed the basis of the developed calculator program [7], which has been introduced into radiology practice.

CONCLUSION: The new model built on apparent diffusion coefficient maps performs better (area under the ROC curve 99.0% [97.7; 100.0]) than existing models, which have an area under the ROC curve of 83.6% [78.3; 88.9] and also show high heterogeneity (I²=71%). The accuracy of the new model increased owing to the stacking of machine learning algorithms, which made it possible to take into account both linear and nonlinear effects of features on the outcome.
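
The stacking step described in the abstract, an arithmetic mean of the two models' predictions, can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the author's code: it assumes the averaged quantities are predicted probabilities (the abstract does not say whether probabilities or class labels were averaged), the function names `stack_by_mean` and `roc_auc` are hypothetical, and the numbers are toy values invented for the example rather than study data. The area under the ROC curve is computed here with the pairwise-ranking definition, which may differ from the evaluation procedure actually used.

```python
from statistics import fmean


def stack_by_mean(p_linear, p_tree):
    """Average two models' predicted probabilities case by case."""
    if len(p_linear) != len(p_tree):
        raise ValueError("both models must score the same cases")
    return [fmean(pair) for pair in zip(p_linear, p_tree)]


def roc_auc(labels, scores):
    """AUC as the share of (positive, negative) pairs ranked correctly.

    A tie between a positive and a negative case counts as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy values invented for illustration only; NOT data from the study.
labels = [1, 1, 1, 1, 0, 0]                  # 1 = cancer, 0 = benign
p_logreg = [0.9, 0.4, 0.7, 0.8, 0.5, 0.2]    # hypothetical logistic regression output
p_tree = [1.0, 1.0, 0.5, 0.5, 0.0, 0.5]      # hypothetical decision tree output

p_stack = stack_by_mean(p_logreg, p_tree)

for name, scores in [("logreg", p_logreg), ("tree", p_tree), ("stack", p_stack)]:
    print(f"{name}: AUC = {roc_auc(labels, scores):.3f}")
```

In this toy case each model misranks a different patient, so the averaged score separates the classes better than either model alone. That is the intuition behind combining a linear and a nonlinear learner, although with real data the gain is not guaranteed and should be checked on held-out cases.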