CONSeg: Voxelwise Uncertainty Quantification for Glioma Segmentation Using Conformal Prediction.

AJNR. American Journal of Neuroradiology. Pub Date: 2025-07-03. DOI: 10.3174/ajnr.A8914

Danial Elyassirad, Benyamin Gheiji, Mahsa Vatanparast, Amir Mahmoud Ahmadzadeh, Shahriar Faghani



Abstract

Background and purpose: Accurate glioma segmentation has the potential to enhance clinical decision-making and treatment planning. Uncertainty quantification methods, including conformal prediction (CP), can enhance the reliability of segmentation models. CP quantifies uncertainty with statistical confidence guarantees. This study aims to use CP in glioma segmentation.

Materials and methods: We used the publicly available UCSF and UPenn glioma datasets. The UCSF dataset (495 cases) was split into training (70%), validation (10%), calibration (10%), and test (10%) sets, and the UPenn dataset (147 cases) was divided into external calibration (30%) and external test (70%) sets. A UNet model was trained, and its optimal threshold was set to 0.5 using prediction normalization. To apply CP, the conformal threshold was selected based on the internal/external calibration nonconformity score, and CP was subsequently applied to the internal/external test sets, with coverage (the proportion of true labels within prediction sets) reported for each. We defined the uncertainty ratio (UR) and assessed its correlation with the Dice score coefficient (DSC) and 95th percentile Hausdorff distance (HD95). Additionally, we categorized cases into certain and uncertain groups based on UR and compared their DSC and HD95. We also evaluated the correlation between UR and the evaluation metrics (DSC and HD95) of the BraTS fusion model segmentation (BFMS), and compared evaluation metrics in the certain and uncertain subgroups.
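As a rough illustration of the calibrate-then-apply workflow described above, the following Python sketch runs split conformal prediction on voxelwise foreground probabilities. The abstract does not specify the nonconformity score or the exact definition of UR, so both are assumptions here: the score is taken as one minus the probability assigned to the true label, and the hypothetical `uncertainty_ratio` counts voxels whose prediction set contains both labels relative to voxels whose set contains tumor. Pooling voxels into one flat array is also a simplification, and all function names are illustrative rather than taken from the paper.

```python
import math
import numpy as np


def calibrate_threshold(cal_probs, cal_labels, alpha=0.1):
    """Pick the conformal threshold q_hat from calibration voxels.

    cal_probs  : foreground (tumor) probabilities in [0, 1]
    cal_labels : ground-truth labels, 0 = background, 1 = tumor
    alpha      : target miscoverage level (coverage goal is 1 - alpha)
    """
    cal_probs = np.asarray(cal_probs, dtype=float).ravel()
    cal_labels = np.asarray(cal_labels).ravel()
    # Assumed nonconformity score: 1 - probability given to the true label.
    p_true = np.where(cal_labels == 1, cal_probs, 1.0 - cal_probs)
    scores = np.sort(1.0 - p_true)
    n = scores.size
    # Finite-sample corrected rank: the ceil((n + 1)(1 - alpha))-th smallest score.
    k = math.ceil((n + 1) * (1.0 - alpha))
    if k > n:
        return 1.0  # too few calibration points: every label is included
    return float(scores[k - 1])


def prediction_sets(probs, q_hat):
    """Return boolean masks saying whether each label is in each voxel's set."""
    probs = np.asarray(probs, dtype=float)
    include_fg = (1.0 - probs) <= q_hat  # score of label 1 is 1 - p
    include_bg = probs <= q_hat          # score of label 0 is p
    return include_bg, include_fg


def coverage(include_bg, include_fg, labels):
    """Proportion of voxels whose prediction set contains the true label."""
    labels = np.asarray(labels)
    covered = np.where(labels == 1, include_fg, include_bg)
    return float(covered.mean())


def uncertainty_ratio(include_bg, include_fg):
    """Hypothetical UR: two-label voxels divided by voxels that may be tumor."""
    both = np.logical_and(include_bg, include_fg).sum()
    possible_tumor = include_fg.sum()
    return float(both / possible_tumor) if possible_tumor > 0 else 0.0


if __name__ == "__main__":
    # Synthetic stand-in for model outputs; not data from the study.
    rng = np.random.default_rng(0)
    labels = rng.integers(0, 2, size=2000)
    probs = np.clip(labels * 0.7 + rng.normal(0.15, 0.2, size=2000), 0.0, 1.0)
    q_hat = calibrate_threshold(probs[:1000], labels[:1000], alpha=0.1)
    bg, fg = prediction_sets(probs[1000:], q_hat)
    print("threshold:", q_hat)
    print("coverage:", coverage(bg, fg, labels[1000:]))
    print("uncertainty ratio:", uncertainty_ratio(bg, fg))
```

Under the usual exchangeability assumption between calibration and test points, a threshold chosen this way gives marginal coverage of at least 1 - alpha. A stricter alpha pushes more voxels into two-label sets, which raises any UR defined along these lines.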

Results: The base model achieved a DSC of 0.86 and 0.83, and an HD95 of 7.35 and 11.71 on the internal and external test sets, respectively. The CP coverage was 0.9982 for the internal test set and 0.9977 for the external test set. Statistical analysis showed significant correlations between UR and evaluation metrics for test sets (p values <0.001). Additionally, certain cases had significantly better evaluation metrics (higher DSC and lower HD95) than uncertain cases in test sets and the BFMS (p values <0.001).
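For readers less familiar with the overlap metric reported above, here is a minimal sketch of the standard Dice computation on binary masks. The handling of two empty masks (returning 1.0) is an assumed convention, not something the abstract states, and `dice_score` is an illustrative name.

```python
import numpy as np


def dice_score(pred, truth):
    """Dice coefficient 2|A ∩ B| / (|A| + |B|) for two binary masks."""
    pred = np.asarray(pred, dtype=bool)
    truth = np.asarray(truth, dtype=bool)
    denom = pred.sum() + truth.sum()
    if denom == 0:
        return 1.0  # assumed convention: two empty masks agree perfectly
    return float(2.0 * np.logical_and(pred, truth).sum() / denom)


if __name__ == "__main__":
    # Two of three predicted voxels overlap a three-voxel reference mask.
    print(dice_score([1, 1, 1, 0], [0, 1, 1, 1]))
```

A DSC of 1 means the predicted and reference masks coincide, and 0 means they do not overlap at all. HD95 complements it by measuring boundary distance instead of volume overlap.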

Conclusions: CP effectively quantifies uncertainty in glioma segmentation. Using CONSeg improves the reliability of segmentation models and enhances human-computer interaction. Additionally, CONSeg can identify uncertain cases and suggest them for manual segmentation.

Abbreviations: CP = conformal prediction; UR = uncertainty ratio; DSC = Dice score coefficient; BFMS = BraTS fusion model segmentation; DL = deep learning; UQ = uncertainty quantification; BCE = binary cross-entropy; BMOT = base model optimal threshold; NCST = nonconformity score threshold; CONSeg = conformal segmentation; BMPN = base model prediction normalization.
