Federica C. Maruccio , Rita Simões , Joëlle E. van Aalst , Charlotte L. Brouwer , Jan-Jakob Sonke , Peter van Ooijen , Tomas M. Janssen
{"title":"Leveraging network uncertainty to identify regions in rectal cancer clinical target volume auto-segmentations likely requiring manual edits","authors":"Federica C. Maruccio , Rita Simões , Joëlle E. van Aalst , Charlotte L. Brouwer , Jan-Jakob Sonke , Peter van Ooijen , Tomas M. Janssen","doi":"10.1016/j.phro.2025.100771","DOIUrl":null,"url":null,"abstract":"<div><h3>Background and Purpose</h3><div>While Deep Learning (DL) auto-segmentation has the potential to improve segmentation efficiency in the radiotherapy workflow, manual adjustments of the predictions are still required. Network uncertainty quantification has been proposed as a quality assurance tool to ensure an efficient segmentation workflow. However, the interpretation is often complicated due to various sources of uncertainty interacting non-trivially. In this work, we compared network predictions with both independent manual segmentations and manual corrections of the predictions. We assume that manual corrections only address clinically relevant errors and are therefore associated with lower aleatoric uncertainty due to less inter-observer variability. We expect the remaining epistemic uncertainty to be a better predictor of segmentation corrections.</div></div><div><h3>Materials and Methods</h3><div>We considered DL auto-segmentations of the mesorectum clinical target volume. Uncertainty maps of nnU-Net outputs were generated using Monte Carlo dropout. On a global level, we investigated the correlation between mean network uncertainty and network segmentation performance. On a local level, we compared the uncertainty envelope width with the length of the error from both independent contours and corrected predictions. The uncertainty envelope widths were used to classify the error lengths as above or below a predefined threshold.</div></div><div><h3>Results</h3><div>We achieved an AUC above 0.9 in identifying regions manually corrected with edits larger than 8 mm, while the AUC for inconsistencies with the independent contours was significantly lower at approximately 0.7.</div></div><div><h3>Conclusions</h3><div>Our results validate the hypothesis that epistemic uncertainty estimates are a valuable tool to capture regions likely requiring clinically relevant edits.</div></div>","PeriodicalId":36850,"journal":{"name":"Physics and Imaging in Radiation Oncology","volume":"34 ","pages":"Article 100771"},"PeriodicalIF":3.4000,"publicationDate":"2025-04-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Physics and Imaging in Radiation Oncology","FirstCategoryId":"1085","ListUrlMain":"https://www.sciencedirect.com/science/article/pii/S2405631625000764","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q2","JCRName":"ONCOLOGY","Score":null,"Total":0}
引用次数: 0
Abstract
Background and Purpose
While Deep Learning (DL) auto-segmentation has the potential to improve segmentation efficiency in the radiotherapy workflow, manual adjustments of the predictions are still required. Network uncertainty quantification has been proposed as a quality assurance tool to ensure an efficient segmentation workflow. However, the interpretation is often complicated due to various sources of uncertainty interacting non-trivially. In this work, we compared network predictions with both independent manual segmentations and manual corrections of the predictions. We assume that manual corrections only address clinically relevant errors and are therefore associated with lower aleatoric uncertainty due to less inter-observer variability. We expect the remaining epistemic uncertainty to be a better predictor of segmentation corrections.
Materials and Methods
We considered DL auto-segmentations of the mesorectum clinical target volume. Uncertainty maps of nnU-Net outputs were generated using Monte Carlo dropout. On a global level, we investigated the correlation between mean network uncertainty and network segmentation performance. On a local level, we compared the uncertainty envelope width with the length of the error from both independent contours and corrected predictions. The uncertainty envelope widths were used to classify the error lengths as above or below a predefined threshold.
Results
We achieved an AUC above 0.9 in identifying regions manually corrected with edits larger than 8 mm, while the AUC for inconsistencies with the independent contours was significantly lower at approximately 0.7.
Conclusions
Our results validate the hypothesis that epistemic uncertainty estimates are a valuable tool to capture regions likely requiring clinically relevant edits.