A time-dependent explainable radiomic analysis from the multi-omic cohort of CPTAC-Pancreatic Ductal Adenocarcinoma
Journal: Computer Methods and Programs in Biomedicine (Impact Factor 4.9; JCR Q1, Computer Science, Interdisciplinary Applications)
Publication date: 2024-09-07
DOI: 10.1016/j.cmpb.2024.108408
URL: https://www.sciencedirect.com/science/article/pii/S0169260724004012
Citations: 0
Abstract
Background and Objective
In Pancreatic Ductal Adenocarcinoma (PDA), multi-omic models are emerging to address the unmet clinical need for novel quantitative prognostic factors. We developed a pipeline that relies on survival machine-learning (SML) classifiers and on explainability based on patients’ follow-up (FU) to stratify prognosis from the publicly available multi-omic datasets of the CPTAC-PDA project.
Materials and Methods
The analyzed datasets included tumor-annotated radiologic images, clinical data, and mutational data. Feature selection was based on univariate (UV) and multivariate (MV) survival analyses according to Overall Survival (OS) and recurrence (REC). We considered seven multi-omic datasets and compared four SML classifiers: Cox, survival random forest, generalized boosted, and support vector machine (SVM) models. For each classifier, we assessed the concordance (C) index on the validation set. The classifiers that performed best on the validation set for OS and for REC underwent explainability analyses using SurvSHAP(t), which extends SHapley Additive exPlanations (SHAP).
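The C index used to compare the classifiers can be illustrated with a minimal sketch of Harrell's concordance index for right-censored data. This is not the authors' implementation: the function name `concordance_index`, the handling of ties, and the toy follow-up data are all assumptions made for illustration. A pair of patients is comparable when the one with the shorter follow-up time experienced the event; the pair is concordant when that patient also received the higher predicted risk.

```python
from itertools import combinations


def concordance_index(times, events, risk_scores):
    """Harrell's C index for right-censored data (simplified sketch).

    times: observed follow-up times
    events: 1 if the event (death or recurrence) was observed, 0 if censored
    risk_scores: higher score means higher predicted risk
    """
    concordant = 0.0
    comparable = 0
    for i, j in combinations(range(len(times)), 2):
        # Order the pair so that `a` has the shorter follow-up time.
        a, b = (i, j) if times[i] < times[j] else (j, i)
        if times[a] == times[b]:
            continue  # tied times are skipped in this simplified sketch
        if not events[a]:
            continue  # shorter time is censored: the pair is not comparable
        comparable += 1
        if risk_scores[a] > risk_scores[b]:
            concordant += 1.0
        elif risk_scores[a] == risk_scores[b]:
            concordant += 0.5  # tied predictions count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Hypothetical toy data: four patients, the second one censored.
times = [5, 10, 15, 20]
events = [1, 0, 1, 1]
risk_scores = [0.9, 0.2, 0.1, 0.6]
c_index = concordance_index(times, events, risk_scores)
```

In this toy example four pairs are comparable and three of them are ordered correctly, so the index is 3/4. A value of 0.5 corresponds to random ranking and 1.0 to perfect ranking.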
Results
For OS, UV and MV analyses selected 18/37 and 10/37 multi-omic features, respectively. For REC, UV and MV analyses selected 10/35 and 5/35 features, respectively. In general, SML classifiers that included radiomics outperformed those built on clinical or mutational predictors. For OS, the Cox model combining radiomic, clinical, and mutational features reached a C index of 75 %, outperforming the other classifiers. For REC, the SVM model including only radiomics performed best, with a C index of 68 %. For OS, SurvSHAP(t) identified the first-order median Gray Level (GL) intensity, gender, tumor grade, the GL Co-occurrence Matrix (GLCM) Joint Energy, and the GLCM Informational Measure of Correlation type 1 as the most important features. For REC, the first-order median GL intensity, the GL size zone matrix Small Area Low GL Emphasis, and the first-order variance of GL intensities emerged as the most discriminative.
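The idea behind time-dependent attributions such as those reported above can be sketched by computing exact Shapley values of a survival function at several time points. Everything in this sketch is an assumption made for illustration: the toy proportional-hazards model with a constant baseline hazard, the coefficients, and the use of a single reference patient in place of a background dataset. It is not the SurvSHAP(t) implementation and not the model fitted in the study.

```python
from itertools import combinations
from math import exp, factorial


def survival(t, x, beta, h0=0.1):
    """Toy proportional-hazards model with constant baseline hazard h0."""
    return exp(-h0 * t * exp(sum(b * v for b, v in zip(beta, x))))


def shapley_over_time(x, reference, beta, time_points):
    """Exact Shapley values of S(t | x) for each feature at each time point."""
    n = len(x)
    result = {}
    for t in time_points:
        def value(subset):
            # Features in `subset` take the patient's values;
            # all other features take the reference values.
            mixed = [x[k] if k in subset else reference[k] for k in range(n)]
            return survival(t, mixed, beta)

        phi = [0.0] * n
        for k in range(n):
            others = [m for m in range(n) if m != k]
            for size in range(n):
                weight = factorial(size) * factorial(n - size - 1) / factorial(n)
                for subset in combinations(others, size):
                    s = set(subset)
                    phi[k] += weight * (value(s | {k}) - value(s))
        result[t] = phi
    return result


# Hypothetical patient, reference patient, and coefficients.
x = [1.0, 0.0, 2.0]
reference = [0.0, 0.0, 0.0]
beta = [0.5, -0.3, 0.2]
time_points = [1.0, 5.0, 10.0]
attributions = shapley_over_time(x, reference, beta, time_points)
```

At every time point the attributions sum to the difference between the patient's survival probability and the reference survival probability, so each feature's contribution can be followed along the follow-up period rather than summarized by a single number.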
Conclusions
In this work, radiomics showed potential for improving patients’ risk stratification in PDA. Furthermore, time-dependent explainability of the top multi-omic predictors provided a deeper understanding of how radiomics can contribute to prognosis in PDA.
About the journal:
To encourage the development of formal computing methods, and their application in biomedical research and medical practice, by illustration of fundamental principles in biomedical informatics research; to stimulate basic research into application software design; to report the state of research of biomedical information processing projects; to report new computer methodologies applied in biomedical areas; the eventual distribution of demonstrable software to avoid duplication of effort; to provide a forum for discussion and improvement of existing software; to optimize contact between national organizations and regional user groups by promoting an international exchange of information on formal methods, standards and software in biomedicine.
Computer Methods and Programs in Biomedicine covers computing methodology and software systems derived from computing science for implementation in all aspects of biomedical research and medical practice. It is designed to serve: biochemists; biologists; geneticists; immunologists; neuroscientists; pharmacologists; toxicologists; clinicians; epidemiologists; psychiatrists; psychologists; cardiologists; chemists; (radio)physicists; computer scientists; programmers and systems analysts; biomedical, clinical, electrical and other engineers; teachers of medical informatics and users of educational software.