CaliDiff: Multi-rater annotation calibrating diffusion probabilistic model towards medical image segmentation

IF 11.8 | Zone 1 (Medicine) | Q1 Computer Science, Artificial Intelligence

Medical Image Analysis | Pub Date: 2025-09-23 | DOI: 10.1016/j.media.2025.103812

Junxia Wang, Jing Wang, Jun Ma, Baijing Chen, Zeyuan Chen, Yuanjie Zheng

Medical Image Analysis, volume 107, Article 103812 (Journal Article). https://www.sciencedirect.com/science/article/pii/S1361841525003585

Citations: 0

Abstract

Medical image segmentation is critical for accurate diagnostics and effective treatment planning. Traditional multi-rater labeling strategies, while integrating consensus from multiple experts, often do not fully capture the unique insights of individual raters. Moreover, deep discriminative models that aggregate such expert labels typically embed inherent biases into the segmentation results. To address these issues, we introduce CaliDiff, a novel multi-rater annotation calibration diffusion probabilistic model. This model effectively approximates the joint probability distribution among multiple expert annotations and their corresponding images, fully leveraging diverse expert knowledge while actively refining these annotations to approximate the true underlying distribution closely. CaliDiff operates through a structured multi-stage process: it begins with a shared-parameter inverse diffusion to normalize initial expert biases, followed by Expertness Consistent Alignment to minimize variance among annotations and enhance consistency in high-confidence areas. Additionally, we incorporate a Committee-based Endogenous Knowledge Learning mechanism that uses adversarial soft supervision to simulate a reliable pseudo-ground truth, integrating Cross-Expert Fusion and Implicit Consensus Inference. Extensive experimental evaluations on various medical image segmentation datasets show that CaliDiff not only significantly improves the calibration of annotations but also achieves state-of-the-art performance, thereby enhancing the reliability and objectivity of medical diagnostics.
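
The abstract mentions minimizing variance among annotations and enhancing consistency in high-confidence areas. As a rough illustration of the quantities involved, and not of CaliDiff itself, the sketch below computes a per-pixel soft consensus, the per-pixel variance across raters, and a mask of pixels where all raters agree. The function name `rater_statistics`, the `agree_threshold` parameter, and the toy masks are all hypothetical; the paper's actual alignment procedure is not described here and may differ substantially.

```python
def rater_statistics(masks, agree_threshold=0.0):
    """Summarize K binary masks (each a list of rows of 0/1 values).

    Returns three grids of the same shape as one mask:
      soft      - mean vote per pixel (a soft consensus label)
      var       - population variance of the votes per pixel
      confident - True where the variance is <= agree_threshold
    This is an illustrative assumption, not the method from the paper.
    """
    num_raters = len(masks)
    height, width = len(masks[0]), len(masks[0][0])
    soft, var, confident = [], [], []
    for r in range(height):
        soft_row, var_row, conf_row = [], [], []
        for c in range(width):
            votes = [float(m[r][c]) for m in masks]
            mu = sum(votes) / num_raters
            v = sum((x - mu) ** 2 for x in votes) / num_raters
            soft_row.append(mu)
            var_row.append(v)
            conf_row.append(v <= agree_threshold)
        soft.append(soft_row)
        var.append(var_row)
        confident.append(conf_row)
    return soft, var, confident


# Toy example: three raters annotate a 2x2 image and disagree on one pixel.
masks = [
    [[1, 0], [1, 1]],
    [[1, 0], [0, 1]],
    [[1, 0], [1, 1]],
]
soft, var, confident = rater_statistics(masks)
```

In this toy case the raters disagree only at the lower-left pixel, so that is the only location with nonzero variance and the only one excluded from the agreement mask.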


Source journal

Medical Image Analysis (Engineering & Technology - Engineering: Biomedical)

CiteScore: 22.10

Self-citation rate: 6.40%

Articles published: 309

Review time: 6.6 months

Journal description: Medical Image Analysis serves as a platform for sharing new research findings in the realm of medical and biological image analysis, with a focus on applications of computer vision, virtual reality, and robotics to biomedical imaging challenges. The journal prioritizes the publication of high-quality, original papers contributing to the fundamental science of processing, analyzing, and utilizing medical and biological images. It welcomes approaches utilizing biomedical image datasets across all spatial scales, from molecular/cellular imaging to tissue/organ imaging.