Diffusion semantic segmentation model: A generative model for medical image segmentation based on joint distribution

IF 3.2 | Zone 2, Medicine | JCR Q1: Radiology, Nuclear Medicine & Medical Imaging

Medical Physics, Pub Date: 2025-06-08, DOI: 10.1002/mp.17928

Tiange Liu, Jinze Li, Drew A. Torigian, Yubing Tong, Qibing Xiong, Kaige Zhang, Jayaram K. Udupa


Citations: 0

Abstract


Diffusion semantic segmentation model: A generative model for medical image segmentation based on joint distribution

Background

The mainstream semantic segmentation schemes in medical image segmentation are essentially discriminative paradigms based on the conditional distribution $p(\text{class} \mid \text{feature})$. Although efficient and straightforward, this prevalent paradigm focuses solely on extracting image features while ignoring the underlying data distribution $p(\text{feature} \mid \text{class})$. Therefore, the learned feature space exhibits inherent instability, which directly affects the precision of the model in delineating anatomical boundaries.
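For readers less familiar with the distinction, the two conditional distributions named above are linked through the joint distribution by standard probability identities. The block below only restates Bayes' rule and is not taken from the paper; the summation index $c'$ (ranging over all classes) is notation introduced here.

```latex
\begin{align}
p(\text{feature}, \text{class})
  &= p(\text{feature} \mid \text{class})\, p(\text{class}) \\
p(\text{class} \mid \text{feature})
  &= \frac{p(\text{feature} \mid \text{class})\, p(\text{class})}
          {\sum_{c'} p(\text{feature} \mid c')\, p(c')}
\end{align}
```

A purely discriminative model fits the left-hand side of the second identity directly, so it never has to represent $p(\text{feature} \mid \text{class})$; a model of the joint distribution represents both factors.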

Purpose

This paper reformulates the semantic segmentation task as a distribution alignment problem for medical image segmentation, aiming to minimize the gap between model predictions and ground truth labels by modeling the joint distribution of the data.

Methods

We propose a novel segmentation architecture based on joint distribution, called Denoising Semantic Segmentation Model (DSSM). We propose learning classification decision boundaries in pixel feature space and modeling joint distributions in latent feature space. Specifically, DSSM optimizes probability maps based on pixel feature classification through Bayesian posterior probabilities. To this end, we design a Feature Fusion Module (FFM) to guide the generative module in inference and provide label features for the semantic module. Furthermore, we introduce a stable Markov inference process to reduce inference offset. Finally, the joint distribution-based model is end-to-end trained in a discriminative manner, that is, maximizing $p(\text{class} \mid \text{feature})$, which endows DSSM with the strengths of both generative and discriminative models.
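As a rough illustration of what refining a per-pixel probability with a Bayesian posterior can mean, the toy sketch below combines a classifier's class probabilities (used as a prior) with class-conditional likelihoods of a scalar feature. It is not the DSSM architecture: the Gaussian likelihood, the `class_stats` values, and the function names are all assumptions made for this example, and the abstract does not describe the model's generative module at this level of detail.

```python
import math


def gaussian_pdf(x, mean, std):
    """Density of a 1-D Gaussian; stands in for p(feature | class) (an assumption)."""
    z = (x - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))


def posterior(prior, likelihood):
    """Bayes' rule for one pixel: posterior[c] is proportional to prior[c] * likelihood[c]."""
    unnormalized = [p * l for p, l in zip(prior, likelihood)]
    evidence = sum(unnormalized)
    if evidence == 0.0:
        # No class explains the feature; fall back to the prior unchanged.
        return list(prior)
    return [u / evidence for u in unnormalized]


# Hypothetical per-class (mean, std) of a scalar feature: background, foreground.
class_stats = [(0.2, 0.1), (0.7, 0.1)]

# Hypothetical classifier output for one ambiguous pixel.
prior = [0.5, 0.5]
feature = 0.6

likelihood = [gaussian_pdf(feature, m, s) for m, s in class_stats]
post = posterior(prior, likelihood)
print(post)
```

In this toy case the classifier alone is undecided, while the feature lies much closer to the foreground mean, so the posterior shifts strongly toward the foreground class.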

Results

The image datasets used in this study span several modalities, including MRI scans, x-ray images, and skin lesion photographs. On these datasets, DSSM demonstrates superior performance compared to state-of-the-art (SOTA) discriminative models. Specifically, DSSM achieved a Dice coefficient of 0.8871 in MSD cardiac MRI segmentation, 0.9451 in ACDC left ventricular MRI segmentation, and 0.9647 in x-ray image segmentation. DSSM also reached a Dice coefficient of 0.8731 in prostate MRI segmentation. Furthermore, in skin lesion segmentation, DSSM achieved a Dice score of 0.8869 on the ISIC 2018 dataset and 0.9421 on the PH2 dataset. Besides the Dice score, HD95, mIoU, Precision, and Recall were evaluated across the above datasets, and they further demonstrate the superior performance of DSSM.
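The overlap metrics named above have standard definitions in terms of true positives, false positives, and false negatives. The sketch below computes Dice, IoU, precision, and recall for a single pair of flattened binary masks. The example masks are invented, the value returned for an empty denominator is a convention chosen here, and the paper's own evaluation code may differ (for instance, mIoU averages IoU over classes, and HD95 is a boundary-distance metric not covered by this sketch).

```python
def confusion_counts(pred, truth):
    """Count true positives, false positives, and false negatives for two binary masks."""
    if len(pred) != len(truth):
        raise ValueError("masks must have the same number of pixels")
    tp = sum(1 for p, t in zip(pred, truth) if p and t)
    fp = sum(1 for p, t in zip(pred, truth) if p and not t)
    fn = sum(1 for p, t in zip(pred, truth) if not p and t)
    return tp, fp, fn


def overlap_metrics(pred, truth):
    """Return Dice, IoU, precision, and recall for flattened binary masks."""
    tp, fp, fn = confusion_counts(pred, truth)

    def ratio(num, den):
        # Convention chosen for this sketch: an empty denominator scores 1.0.
        return num / den if den else 1.0

    return {
        "dice": ratio(2 * tp, 2 * tp + fp + fn),
        "iou": ratio(tp, tp + fp + fn),
        "precision": ratio(tp, tp + fp),
        "recall": ratio(tp, tp + fn),
    }


# Invented 8-pixel masks: 3 true positives, 1 false positive, 1 false negative.
pred = [1, 1, 1, 0, 0, 0, 1, 0]
truth = [1, 1, 0, 0, 0, 1, 1, 0]
metrics = overlap_metrics(pred, truth)
print(metrics)
```

For these masks, Dice is 2·3 / (2·3 + 1 + 1) = 0.75 and IoU is 3 / 5 = 0.6, which shows why Dice scores are always at least as high as IoU on the same prediction.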

Conclusions

Our method stabilizes the learned feature space by capturing latent feature distribution information. Experimental results demonstrate that our model considerably outperforms traditional discriminative segmentation methods across a variety of datasets from multiple modalities.


Source journal

Medical Physics (Medicine: Nuclear Medicine)

CiteScore: 6.80

Self-citation rate: 15.80%

Articles published: 660

Review time: 1.7 months

About the journal: Medical Physics publishes original, high-impact physics, imaging science, and engineering research that advances patient diagnosis and therapy through contributions in 1) basic science developments with high potential for clinical translation; 2) clinical applications of cutting-edge engineering and physics innovations; and 3) broadly applicable and innovative clinical physics developments. Medical Physics is a journal of global scope and reach. By publishing in Medical Physics, your research will reach an international, multidisciplinary audience including practicing medical physicists as well as physics- and engineering-based translational scientists. We work closely with authors of promising articles to improve their quality.