Uncertainty-aware deep learning for segmentation of primary tumor and pathologic lymph nodes in oropharyngeal cancer: Insights from a multi-center cohort

IF 5.4, Zone 2 (Medicine), Q1 ENGINEERING, BIOMEDICAL
Alessia De Biase, Nanna Maria Sijtsema, Lisanne V. van Dijk, Roel Steenbakkers, Johannes A. Langendijk, Peter van Ooijen
Computerized Medical Imaging and Graphics, volume 123, Article 102535. Journal Article. Published 2025-03-25.
DOI: 10.1016/j.compmedimag.2025.102535
https://www.sciencedirect.com/science/article/pii/S0895611125000448
Citations: 0

Abstract

Purpose

Clinical introduction of deep learning (DL) tumor segmentation requires information on its accuracy at both the voxel and the structure level. A previous study developed a DL model for oropharyngeal cancer (OPC) primary tumor (PT) segmentation in PET/CT images and introduced voxel-level predicted probabilities (TPMs) that quantify model certainty. This study extended the network to simultaneously generate TPMs for the PT and pathologic lymph nodes (PL), and explored whether structure-level uncertainty in TPMs predicts segmentation model accuracy in an independent external cohort.

Methods

We retrospectively gathered PET/CT images and manual delineations of the gross tumor volume of the PT (GTVp) and PL (GTVln) for 407 OPC patients treated with (chemo)radiation in our institute. The HECKTOR 2022 challenge dataset served as the external test set. The pre-existing architecture was modified for multi-label segmentation. Multiple models were trained, and the non-binarized ensemble average of their TPMs was considered per patient. Segmentation accuracy was quantified by the surface and aggregate Dice similarity coefficient (DSC), and model uncertainty by the coefficient of variation (CV) across the multiple predictions.
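
The ensemble and uncertainty steps described in the Methods can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation. The abstract does not say how the CV is reduced to one value per structure, so the sketch assumes the CV is computed per voxel across models (standard deviation divided by mean) and then averaged over voxels where the ensemble mean reaches a hypothetical threshold of 0.5. The function names, the threshold, and the small epsilon are all assumptions. Aggregate DSC is shown in its usual pooled form, where intersections and volumes are summed over all patients before the ratio is taken.

```python
import numpy as np


def ensemble_mean(tpms):
    """Average the non-binarized probability maps of several models.

    tpms has shape (n_models, *volume_shape) with values in [0, 1].
    """
    return np.mean(tpms, axis=0)


def structure_cv(tpms, threshold=0.5, eps=1e-8):
    """Assumed structure-level uncertainty: mean voxel-wise CV inside
    the region where the ensemble mean is at least `threshold`."""
    mean = np.mean(tpms, axis=0)
    std = np.std(tpms, axis=0)  # spread across models, per voxel
    mask = mean >= threshold
    if not mask.any():
        return float("nan")  # nothing predicted, CV undefined
    cv = std[mask] / (mean[mask] + eps)
    return float(cv.mean())


def aggregate_dsc(preds, refs):
    """Pooled Dice over a cohort: 2 * sum|P_i & R_i| / sum(|P_i| + |R_i|).

    preds and refs are sequences of boolean arrays, one pair per patient.
    """
    inter = sum(np.logical_and(p, r).sum() for p, r in zip(preds, refs))
    total = sum(p.sum() + r.sum() for p, r in zip(preds, refs))
    return 2.0 * inter / total if total > 0 else float("nan")
```

Pooling before dividing keeps patients with very small or absent structures from dominating the score, which a per-patient average of DSC values would allow.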

Results

Predicted GTVp and GTVln segmentations in the external test set achieved aggregate DSCs of 0.75 and 0.70, respectively. Patient-specific CV and surface DSC were significantly correlated for both structures in the external set (−0.54 for GTVp and −0.66 for GTVln), indicating significant calibration.
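
The relationship between per-patient CV and surface DSC can be checked with a rank correlation, sketched below. The abstract does not name the correlation coefficient used, so Spearman's coefficient is an assumption here, and the function names and the example values are hypothetical rather than study data.

```python
def ranks(values):
    """Return 1-based ranks, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = avg
        i = j + 1
    return result


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / (vx * vy) ** 0.5


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


# Hypothetical per-patient values, for illustration only.
cv_per_patient = [0.05, 0.10, 0.20, 0.40]
surface_dsc_per_patient = [0.95, 0.90, 0.70, 0.40]
rho = spearman(cv_per_patient, surface_dsc_per_patient)
```

A negative coefficient, as reported for both structures, means that patients with higher CV tend to have lower surface DSC, which is what makes the uncertainty usable as a flag for less accurate segmentations.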

Conclusion

Significant calibration between accuracy and uncertainty was achieved for TPMs in both the internal and external test sets. This indicates that quantified uncertainty from TPMs could be used to identify cases with lower GTVp and GTVln segmentation accuracy, independently of the dataset.
Source journal: CiteScore 10.70; self-citation rate 3.50%; articles published 71; review time 26 days
About the journal: The purpose of the journal Computerized Medical Imaging and Graphics is to act as a source for the exchange of research results concerning algorithmic advances, development, and application of digital imaging in disease detection, diagnosis, intervention, prevention, precision medicine, and population health. Included in the journal will be articles on novel computerized imaging or visualization techniques, including artificial intelligence and machine learning, augmented reality for surgical planning and guidance, big biomedical data visualization, computer-aided diagnosis, computerized-robotic surgery, image-guided therapy, imaging scanning and reconstruction, mobile and tele-imaging, radiomics, and imaging integration and modeling with other information relevant to digital health. The types of biomedical imaging include: magnetic resonance, computed tomography, ultrasound, nuclear medicine, X-ray, microwave, optical and multi-photon microscopy, video and sensory imaging, and the convergence of biomedical images with other non-imaging datasets.