Stress testing deep learning models for prostate cancer detection on biopsies and surgical specimens.

Impact factor 5.6 · Zone 2 (Medicine) · JCR Q1 (Oncology)
Brennan T Flannery, Howard M Sandler, Priti Lal, Michael D Feldman, Juan C Santa-Rosario, Tilak Pathak, Tuomas Mirtti, Xavier Farre, Rohann Correa, Susan Chafe, Amit Shah, Jason A Efstathiou, Karen Hoffman, Mark A Hallman, Michael Straza, Richard Jordan, Stephanie L Pugh, Felix Feng, Anant Madabhushi
The Journal of Pathology. Journal Article. Published 2024-12-11. DOI: 10.1002/path.6373 (https://doi.org/10.1002/path.6373)

Abstract

The presence, location, and extent of prostate cancer are assessed by pathologists using H&E-stained tissue slides. Machine learning approaches can accomplish these tasks for both biopsies and radical prostatectomies. Deep learning approaches using convolutional neural networks (CNNs) have been shown to identify cancer in pathologic slides, and some have secured regulatory approval for clinical use. However, differences in sample processing can subtly alter the morphology between sample types, making it unclear whether deep learning algorithms will work consistently on both types of slide images. Our goal was to investigate whether morphological differences between sample types affected the performance of biopsy-trained cancer detection CNN models when applied to radical prostatectomies, and vice versa, using multiple cohorts (N = 1,000). Radical prostatectomies (N = 100) and biopsies (N = 50) were acquired from The University of Pennsylvania to train (80%) and validate (20%) a DenseNet CNN for biopsies (M^B), radical prostatectomies (M^R), and a combined dataset (M^(B+R)). On a tile level, M^B and M^R achieved F1 scores greater than 0.88 when applied to their own sample type but less than 0.65 when applied across sample types. On a whole-slide level, models achieved significantly better performance on their own sample type compared to the alternative model (p < 0.05) for all metrics. This was confirmed by external validation using digitized biopsy slide images from a clinical trial [NRG Radiation Therapy Oncology Group (RTOG)] (NRG/RTOG 0521, N = 750) via both qualitative and quantitative analyses (p < 0.05). A comprehensive review of model outputs revealed morphologically driven decision making that adversely affected model performance. M^B appeared to struggle with open gland structures, whereas M^R appeared to struggle with closed gland structures, indicating potential morphological variation between the training sets. These findings suggest that differences in morphology and heterogeneity call for more tailored, sample-specific (i.e. biopsy and surgical) machine learning models. © 2024 The Author(s). The Journal of Pathology published by John Wiley & Sons Ltd on behalf of The Pathological Society of Great Britain and Ireland.
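
The tile-level comparison described in the abstract (each model scored on its own sample type and on the other one) can be illustrated with a small cross-evaluation table of F1 scores. The following is a minimal sketch, not the authors' code: the function names `f1_score` and `cross_evaluate`, the dictionary layout, and the toy labels are all assumptions made for illustration, and the numbers are invented rather than taken from the study.

```python
from typing import Dict, Sequence, Tuple


def f1_score(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Binary F1 for tile labels (1 = cancer, 0 = benign)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    denom = 2 * tp + fp + fn
    # Return 0.0 when there are no positives in either labels or predictions.
    return 2 * tp / denom if denom else 0.0


def cross_evaluate(
    predictions: Dict[str, Dict[str, Sequence[int]]],
    labels: Dict[str, Sequence[int]],
) -> Dict[Tuple[str, str], float]:
    """Score every (model, sample type) pair.

    predictions[model][sample_type] holds that model's tile predictions;
    labels[sample_type] holds the matching ground-truth tile labels.
    """
    return {
        (model, sample_type): f1_score(labels[sample_type], preds)
        for model, by_type in predictions.items()
        for sample_type, preds in by_type.items()
    }


# Toy, made-up data (hypothetical; NOT results from the paper).
toy_labels = {"biopsy": [1, 1, 0, 0], "prostatectomy": [1, 0, 1, 0]}
toy_predictions = {
    "M_B": {"biopsy": [1, 1, 0, 0], "prostatectomy": [0, 0, 1, 1]},
}
toy_scores = cross_evaluate(toy_predictions, toy_labels)
for (model, sample_type), score in toy_scores.items():
    print(f"{model} on {sample_type}: F1 = {score:.2f}")
```

Reading the resulting table row by row makes a drop from own-type to cross-type performance easy to spot, which is the pattern the abstract reports for M^B and M^R.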

Source journal: The Journal of Pathology (Medicine: Pathology)
CiteScore: 14.10
Self-citation rate: 1.40%
Articles published: 144
Review time: 3-8 weeks
About the journal: The Journal of Pathology aims to serve as a translational bridge between basic biomedical science and clinical medicine with particular emphasis on, but not restricted to, tissue-based studies. The main interests of the Journal lie in publishing studies that further our understanding of the pathophysiological and pathogenetic mechanisms of human disease. The Journal of Pathology welcomes investigative studies on human tissues, in vitro and in vivo experimental studies, and investigations based on animal models with a clear relevance to human disease, including transgenic systems. As well as original research papers, the Journal seeks to provide rapid publication in a variety of other formats, including editorials, review articles, commentaries and perspectives and other features, both contributed and solicited.