Uncertainty quantification for deep learning-based metastatic lesion segmentation on whole body PET/CT.

IF 3.3 | Zone 3, Medicine | Q2 ENGINEERING, BIOMEDICAL
Brayden Schott, Victor Santoro-Fernandes, Žan Klaneček, Scott Perlman, Robert Jeraj
Physics in medicine and biology. Journal Article. Published 2025-05-23. DOI: 10.1088/1361-6560/add9df (https://doi.org/10.1088/1361-6560/add9df)
Citations: 0

Abstract

Objective. Deep learning models are increasingly being implemented for automated medical image analysis to inform patient care. Most models, however, lack uncertainty information, without which the reliability of model outputs cannot be ensured. Several uncertainty quantification (UQ) methods exist to capture model uncertainty, yet it is not clear which method is optimal for a given task. The purpose of this work was to investigate several commonly used UQ methods for the critical yet understudied task of metastatic lesion segmentation on whole body PET/CT.

Approach. Fifty-nine whole body 68Ga-DOTATATE PET/CT images of patients undergoing theranostic treatment of metastatic neuroendocrine tumors were used in this work. A 3D U-Net was trained for lesion segmentation using five-fold cross validation. Uncertainty measures derived from four UQ methods (probability entropy, Monte Carlo dropout, deep ensembles, and test time augmentation) were investigated. Each uncertainty measure was assessed across four quantitative evaluations: (1) its ability to detect artificially degraded image data at low, medium, and high degradation magnitudes; (2) its ability to detect false-positive (FP) predicted regions; (3) its ability to recover false-negative (FN) predicted regions; and (4) its correlation with model biomarker extraction and segmentation performance metrics.

Main results. Test time augmentation achieved the highest and probability entropy the lowest degraded image detection at low (AUC = 0.68 vs. 0.54), medium (AUC = 0.82 vs. 0.70), and high (AUC = 0.90 vs. 0.83) degradation magnitudes. For detecting FPs, all UQ methods achieved strong performance, with AUC values in the narrow range of 0.77 to 0.81. FN region recovery performance was strongest for test time augmentation and weakest for probability entropy. Results of the correlation analysis were mixed: the strongest correlations were achieved by test time augmentation for SUVtotal capture (ρ = 0.57) and segmentation Dice coefficient (ρ = 0.72), by Monte Carlo dropout for SUVmean capture (ρ = 0.35), and by probability entropy for segmentation cross entropy (ρ = 0.96).

Significance. Overall, test time augmentation demonstrated superior UQ performance and is recommended for use in metastatic lesion segmentation tasks. It also offers the advantage of being post hoc and computationally efficient. In contrast, probability entropy performed the worst, highlighting the need for advanced UQ approaches for this task.
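
To make the four UQ methods named in the abstract more concrete, the following is a minimal NumPy sketch of how voxel-wise uncertainty maps might be computed. It is not the authors' implementation. The abstract does not give the exact definitions, so everything here is an assumption: the function names (binary_entropy, sample_uncertainty, flip_tta), the use of axis flips as the augmentation, the choice of predictive entropy and variance as summaries, the mean over voxels as an image-level score, and toy_predict, a hypothetical stand-in for a trained 3D U-Net.

```python
import numpy as np

def binary_entropy(p, eps=1e-12):
    """Voxel-wise Shannon entropy (in bits) of a foreground-probability map."""
    p = np.clip(np.asarray(p, dtype=float), eps, 1.0 - eps)
    return -(p * np.log2(p) + (1.0 - p) * np.log2(1.0 - p))

def sample_uncertainty(prob_samples):
    """Summarize N probability maps of shape (N, ...) voxel by voxel.

    The N samples could come from Monte Carlo dropout passes, deep ensemble
    members, or test time augmentations mapped back to the input orientation.
    """
    samples = np.asarray(prob_samples, dtype=float)
    mean_prob = samples.mean(axis=0)
    return {
        "mean_prob": mean_prob,
        "predictive_entropy": binary_entropy(mean_prob),
        "variance": samples.var(axis=0),
    }

def flip_tta(predict, volume, axes=(0, 1, 2)):
    """Test time augmentation with axis flips: flip, predict, flip back."""
    preds = [predict(volume)]
    for ax in axes:
        flipped_pred = predict(np.flip(volume, axis=ax))
        preds.append(np.flip(flipped_pred, axis=ax))
    return np.stack(preds)

def toy_predict(volume):
    """Hypothetical stand-in for a trained network (assumption, not a U-Net).

    A position-dependent bias along axis 0 makes the output sensitive to
    flips along that axis, so the augmented predictions disagree.
    """
    bias = np.linspace(-1.0, 1.0, volume.shape[0]).reshape(-1, 1, 1)
    return 1.0 / (1.0 + np.exp(-(volume + bias)))

rng = np.random.default_rng(0)
volume = rng.normal(size=(8, 8, 8))          # synthetic image, not PET/CT data

# Probability entropy needs only a single forward pass.
single_pass_entropy = binary_entropy(toy_predict(volume))

# Sample-based measures need several predictions of the same image.
samples = flip_tta(toy_predict, volume)      # shape (4, 8, 8, 8)
summary = sample_uncertainty(samples)

# One possible image-level score: mean voxel-wise predictive entropy.
image_level_score = float(summary["predictive_entropy"].mean())
```

The same sample_uncertainty function applies unchanged to Monte Carlo dropout or deep ensembles; only the source of the stacked predictions differs. Test time augmentation needs no retraining, which is consistent with the abstract's description of it as post hoc.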

Source journal
Physics in medicine and biology (Medicine; Engineering: Biomedical)
CiteScore: 6.50
Self-citation rate: 14.30%
Articles published: 409
Review time: 2 months
Journal description: The development and application of theoretical, computational and experimental physics to medicine, physiology and biology. Topics covered are: therapy physics (including ionizing and non-ionizing radiation); biomedical imaging (e.g. x-ray, magnetic resonance, ultrasound, optical and nuclear imaging); image-guided interventions; image reconstruction and analysis (including kinetic modelling); artificial intelligence in biomedical physics and analysis; nanoparticles in imaging and therapy; radiobiology; radiation protection and patient dose monitoring; radiation dosimetry.