Impact of Scanner Manufacturer, Endorectal Coil Use, and Clinical Variables on Deep Learning-assisted Prostate Cancer Classification Using Multiparametric MRI.
IF 8.1 Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE
José Guilherme de Almeida, Nuno M Rodrigues, Ana Sofia Castro Verde, Ana Mascarenhas Gaivão, Carlos Bilreiro, Inês Santiago, Joana Ip, Sara Belião, Celso Matos, Sara Silva, Manolis Tsiknakis, Kostantinos Marias, Daniele Regge, Nikolaos Papanikolaou
{"title":"Impact of Scanner Manufacturer, Endorectal Coil Use, and Clinical Variables on Deep Learning-assisted Prostate Cancer Classification Using Multiparametric MRI.","authors":"José Guilherme de Almeida, Nuno M Rodrigues, Ana Sofia Castro Verde, Ana Mascarenhas Gaivão, Carlos Bilreiro, Inês Santiago, Joana Ip, Sara Belião, Celso Matos, Sara Silva, Manolis Tsiknakis, Kostantinos Marias, Daniele Regge, Nikolaos Papanikolaou","doi":"10.1148/ryai.230555","DOIUrl":null,"url":null,"abstract":"<p><p>Purpose To assess the effect of scanner manufacturer and scanning protocol on the performance of deep learning models to classify aggressiveness of prostate cancer (PCa) at biparametric MRI (bpMRI). Materials and Methods In this retrospective study, 5478 cases from ProstateNet, a PCa bpMRI dataset with examinations from 13 centers, were used to develop five deep learning (DL) models to predict PCa aggressiveness with minimal lesion information and test how using data from different subgroups-scanner manufacturers and endorectal coil (ERC) use (Siemens, Philips, GE with and without ERC, and the full dataset)-affects model performance. Performance was assessed using the area under the receiver operating characteristic curve (AUC). The effect of clinical features (age, prostate-specific antigen level, Prostate Imaging Reporting and Data System score) on model performance was also evaluated. Results DL models were trained on 4328 bpMRI cases, and the best model achieved an AUC of 0.73 when trained and tested using data from all manufacturers. Held-out test set performance was higher when models trained with data from a manufacturer were tested on the same manufacturer (within- and between-manufacturer AUC differences of 0.05 on average, <i>P</i> < .001). The addition of clinical features did not improve performance (<i>P</i> = .24). Learning curve analyses showed that performance remained stable as training data increased. Analysis of DL features showed that scanner manufacturer and scanning protocol heavily influenced feature distributions. Conclusion In automated classification of PCa aggressiveness using bpMRI data, scanner manufacturer and ERC use had a major effect on DL model performance and features. <b>Keywords:</b> Convolutional Neural Network (CNN), Computer-aided Diagnosis (CAD), Computer Applications-General (Informatics), Oncology <i>Supplemental material is available for this article.</i> Published under a CC BY 4.0 license. See also commentary by Suri and Hsu in this issue.</p>","PeriodicalId":29787,"journal":{"name":"Radiology-Artificial Intelligence","volume":" ","pages":"e230555"},"PeriodicalIF":8.1000,"publicationDate":"2025-05-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Radiology-Artificial Intelligence","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1148/ryai.230555","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE","Score":null,"Total":0}
引用次数: 0
Abstract
Purpose To assess the effect of scanner manufacturer and scanning protocol on the performance of deep learning models to classify aggressiveness of prostate cancer (PCa) at biparametric MRI (bpMRI). Materials and Methods In this retrospective study, 5478 cases from ProstateNet, a PCa bpMRI dataset with examinations from 13 centers, were used to develop five deep learning (DL) models to predict PCa aggressiveness with minimal lesion information and test how using data from different subgroups-scanner manufacturers and endorectal coil (ERC) use (Siemens, Philips, GE with and without ERC, and the full dataset)-affects model performance. Performance was assessed using the area under the receiver operating characteristic curve (AUC). The effect of clinical features (age, prostate-specific antigen level, Prostate Imaging Reporting and Data System score) on model performance was also evaluated. Results DL models were trained on 4328 bpMRI cases, and the best model achieved an AUC of 0.73 when trained and tested using data from all manufacturers. Held-out test set performance was higher when models trained with data from a manufacturer were tested on the same manufacturer (within- and between-manufacturer AUC differences of 0.05 on average, P < .001). The addition of clinical features did not improve performance (P = .24). Learning curve analyses showed that performance remained stable as training data increased. Analysis of DL features showed that scanner manufacturer and scanning protocol heavily influenced feature distributions. Conclusion In automated classification of PCa aggressiveness using bpMRI data, scanner manufacturer and ERC use had a major effect on DL model performance and features. Keywords: Convolutional Neural Network (CNN), Computer-aided Diagnosis (CAD), Computer Applications-General (Informatics), Oncology Supplemental material is available for this article. Published under a CC BY 4.0 license. See also commentary by Suri and Hsu in this issue.
期刊介绍:
Radiology: Artificial Intelligence is a bi-monthly publication that focuses on the emerging applications of machine learning and artificial intelligence in the field of imaging across various disciplines. This journal is available online and accepts multiple manuscript types, including Original Research, Technical Developments, Data Resources, Review articles, Editorials, Letters to the Editor and Replies, Special Reports, and AI in Brief.