Optimizing stroke lesion segmentation: A dual-approach using Gaussian mixture models and nnU-Net

IF 7.0, Zone 2 (Medicine), JCR Q1 (Biology)
Adrian Mannel, Dhaval Khunt, Vaibhav Agrawal, Kristin Schelling, Eduardo Calderón, Christian la Fougère, Salvador Castaneda-Vega
Journal: Computers in Biology and Medicine, volume 192, Article 110221
Publication date: 2025-05-02
Publication type: Journal Article
DOI: 10.1016/j.compbiomed.2025.110221
URL: https://www.sciencedirect.com/science/article/pii/S0010482525005724

Abstract

Machine learning-based stroke lesion segmentation models are widely used in biomedical imaging, but their ability to detect treatment effects remains largely unexplored. Gaussian Mixture Models (GMM) and nnU-Net are among the most prominent and well-established segmentation workflows. GMM has been used for probabilistic tissue classification for decades, while nnU-Net has established itself as a leading deep learning framework for biomedical image segmentation, with hundreds of applications in preclinical and clinical research. Despite their widespread adoption, these methods are typically evaluated using segmentation metrics alone, without assessing their reliability in detecting therapy-induced changes, a critical factor for translational research and clinical decision-making.
In this study, we systematically evaluate GMM and nnU-Net to determine their effectiveness in identifying therapy-related changes in stroke lesion volume. Both methods demonstrate strong segmentation performance; however, nnU-Net trained solely on manual segmentations fails to detect significant therapy-induced reductions in stroke lesion volume, leading to false-negative study outcomes despite achieving excellent segmentation metrics. This limitation is particularly relevant given the increasing integration of nnU-Net into biomedical research, multi-center trials and clinical workflows.
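The distinction between segmentation accuracy and therapy detection can be illustrated with a minimal sketch. It is not the authors' analysis: the Dice helper, the permutation test, and all lesion volumes below are hypothetical and synthetic, chosen only to show that a group difference visible in reference volumes can disappear in model-derived volumes.

```python
import random
from statistics import mean


def dice(pred, truth):
    """Dice overlap between two binary masks given as flat 0/1 sequences."""
    inter = sum(1 for p, t in zip(pred, truth) if p and t)
    total = sum(pred) + sum(truth)
    return 1.0 if total == 0 else 2.0 * inter / total


def permutation_p_value(control, treated, n_perm=2000, seed=0):
    """Two-sided permutation test on the difference in mean lesion volume."""
    rng = random.Random(seed)
    observed = abs(mean(control) - mean(treated))
    pooled = list(control) + list(treated)
    n_control = len(control)
    hits = 0
    for _ in range(n_perm):
        # Randomly reassign animals to the two groups and recompute the gap.
        rng.shuffle(pooled)
        diff = abs(mean(pooled[:n_control]) - mean(pooled[n_control:]))
        if diff >= observed:
            hits += 1
    # Add-one correction keeps the estimate strictly above zero.
    return (hits + 1) / (n_perm + 1)


# Synthetic lesion volumes (arbitrary units), NOT data from the study.
control_reference = [100, 110, 95, 105, 102, 98]
treated_reference = [80, 85, 78, 90, 82, 88]

# Hypothetical model output whose volumes blur the group difference.
control_model = [98, 104, 92, 101, 97, 95]
treated_model = [93, 99, 90, 103, 94, 100]

p_reference = permutation_p_value(control_reference, treated_reference)
p_model = permutation_p_value(control_model, treated_model)

print(f"Dice on a toy mask pair: {dice([1, 1, 0, 0], [1, 0, 0, 0]):.3f}")
print(f"p (reference volumes): {p_reference:.4f}")
print(f"p (model volumes):     {p_model:.4f}")
```

In this toy setting the reference volumes separate the groups while the model volumes do not, which is the kind of false-negative outcome the abstract describes; per-scan Dice scores alone would not reveal it.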
To further investigate this issue, we evaluated nnU-Net trained with GMM-derived ground truth (GT) labels and observed that it more accurately detected therapy response compared to training with Manual-GT. These results illustrate how different GT definitions can influence model performance in therapy assessment. While the integration of probabilistic methods with deep learning has been previously described, our results demonstrate its practical impact in a controlled experimental setting. By systematically evaluating two widely used segmentation methods under therapy conditions, this study highlights the importance of considering therapy detection as a key evaluation criterion, rather than relying solely on segmentation accuracy.
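As a rough illustration of what GMM-derived labels could look like in the simplest case, the sketch below fits a two-component one-dimensional Gaussian mixture to voxel intensities with expectation-maximization and thresholds the posterior. It is an assumption-laden toy, not the study's pipeline: the function names, the synthetic intensities, the 0.5 threshold, and the premise that lesion voxels form the brighter component are all hypothetical.

```python
import numpy as np


def _responsibilities(x, weights, means, variances):
    """Posterior probability of each component for each voxel intensity."""
    diff = x[:, None] - means
    dens = weights * np.exp(-0.5 * diff ** 2 / variances)
    dens = dens / np.sqrt(2.0 * np.pi * variances)
    # Guard against division by zero for intensities far from both components.
    return dens / np.maximum(dens.sum(axis=1, keepdims=True), 1e-300)


def fit_gmm_1d(x, n_iter=100):
    """Fit a two-component 1D Gaussian mixture with EM."""
    x = np.asarray(x, dtype=float)
    # Initialise the components at the intensity extremes.
    means = np.array([x.min(), x.max()])
    variances = np.full(2, x.var() + 1e-6)
    weights = np.full(2, 0.5)
    for _ in range(n_iter):
        # E-step: soft assignment of voxels to components.
        resp = _responsibilities(x, weights, means, variances)
        # M-step: update mixture parameters from the soft assignments.
        nk = resp.sum(axis=0)
        weights = nk / x.size
        means = (resp * x[:, None]).sum(axis=0) / nk
        variances = (resp * (x[:, None] - means) ** 2).sum(axis=0) / nk + 1e-6
    return weights, means, variances


def lesion_probability(x, weights, means, variances):
    """Posterior of the brighter component (assumed here to be lesion)."""
    x = np.asarray(x, dtype=float)
    resp = _responsibilities(x, weights, means, variances)
    return resp[:, np.argmax(means)]


# Synthetic intensities: 900 "healthy" and 100 "lesion" voxels.
rng = np.random.default_rng(0)
voxels = np.concatenate([rng.normal(100, 10, 900), rng.normal(160, 10, 100)])

weights, means, variances = fit_gmm_1d(voxels)
labels = lesion_probability(voxels, weights, means, variances) > 0.5
print("component means:", np.round(np.sort(means), 1))
print("voxels labelled as lesion:", int(labels.sum()))
```

Binary masks produced this way could, in principle, serve as training labels for a network such as nnU-Net. A real workflow would operate on 3D image volumes with registration, masking, and quality control that this sketch omits.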

Source journal
Computers in Biology and Medicine (Engineering Technology: Biomedical Engineering)
CiteScore: 11.70
Self-citation rate: 10.40%
Articles published: 1086
Review time: 74 days
About the journal: Computers in Biology and Medicine is an international forum for sharing groundbreaking advancements in the use of computers in bioscience and medicine. This journal serves as a medium for communicating essential research, instruction, ideas, and information regarding the rapidly evolving field of computer applications in these domains. By encouraging the exchange of knowledge, we aim to facilitate progress and innovation in the utilization of computers in biology and medicine.