{"title":"QCResUNet:主题级和体素级分割质量联合预测","authors":"Peijie Qiu , Satrajit Chakrabarty , Phuc Nguyen , Soumyendu Sekhar Ghosh , Aristeidis Sotiras","doi":"10.1016/j.media.2025.103718","DOIUrl":null,"url":null,"abstract":"<div><div>Deep learning has made significant strides in automated brain tumor segmentation from magnetic resonance imaging (MRI) scans in recent years. However, the reliability of these tools is hampered by the presence of poor-quality segmentation outliers, particularly in out-of-distribution samples, making their implementation in clinical practice difficult. Therefore, there is a need for quality control (QC) to screen the quality of the segmentation results. Although numerous automatic QC methods have been developed for segmentation quality screening, most were designed for cardiac MRI segmentation, which involves a single modality and a single tissue type. Furthermore, most prior works only provided subject-level predictions of segmentation quality and did not identify erroneous parts segmentation that may require refinement. To address these limitations, we proposed a novel multi-task deep learning architecture, termed QCResUNet, which produces subject-level segmentation-quality measures as well as voxel-level segmentation error maps for each available tissue class. To validate the effectiveness of the proposed method, we conducted experiments on assessing its performance on evaluating the quality of two distinct segmentation tasks. First, we aimed to assess the quality of brain tumor segmentation results. For this task, we performed experiments on one internal (Brain Tumor Segmentation (BraTS) Challenge 2021, <span><math><mrow><mi>n</mi><mo>=</mo><mn>1</mn><mo>,</mo><mn>251</mn></mrow></math></span>) and two external datasets (BraTS Challenge 2023 in Sub-Saharan Africa Patient Population (BraTS-SSA), <span><math><mrow><mi>n</mi><mo>=</mo><mn>40</mn></mrow></math></span>; Washington University School of Medicine (WUSM), <span><math><mrow><mi>n</mi><mo>=</mo><mn>175</mn></mrow></math></span>). Specifically, we first performed a three-fold cross-validation on the internal dataset using segmentations generated by different methods at various quality levels, followed by an evaluation on the external datasets. Second, we aimed to evaluate the segmentation quality of cardiac Magnetic Resonance Imaging (MRI) data from the Automated Cardiac Diagnosis Challenge (ACDC, <span><math><mrow><mi>n</mi><mo>=</mo><mn>100</mn></mrow></math></span>). The proposed method achieved high performance in predicting subject-level segmentation-quality metrics and accurately identifying segmentation errors on a voxel basis. This has the potential to be used to guide human-in-the-loop feedback to improve segmentations in clinical settings.</div></div>","PeriodicalId":18328,"journal":{"name":"Medical image analysis","volume":"107 ","pages":"Article 103718"},"PeriodicalIF":11.8000,"publicationDate":"2025-08-27","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"QCResUNet: Joint subject-level and voxel-level segmentation quality prediction\",\"authors\":\"Peijie Qiu , Satrajit Chakrabarty , Phuc Nguyen , Soumyendu Sekhar Ghosh , Aristeidis Sotiras\",\"doi\":\"10.1016/j.media.2025.103718\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"<div><div>Deep learning has made significant strides in automated brain tumor segmentation from magnetic resonance imaging (MRI) scans in recent years. 
However, the reliability of these tools is hampered by the presence of poor-quality segmentation outliers, particularly in out-of-distribution samples, making their implementation in clinical practice difficult. Therefore, there is a need for quality control (QC) to screen the quality of the segmentation results. Although numerous automatic QC methods have been developed for segmentation quality screening, most were designed for cardiac MRI segmentation, which involves a single modality and a single tissue type. Furthermore, most prior works only provided subject-level predictions of segmentation quality and did not identify erroneous parts segmentation that may require refinement. To address these limitations, we proposed a novel multi-task deep learning architecture, termed QCResUNet, which produces subject-level segmentation-quality measures as well as voxel-level segmentation error maps for each available tissue class. To validate the effectiveness of the proposed method, we conducted experiments on assessing its performance on evaluating the quality of two distinct segmentation tasks. First, we aimed to assess the quality of brain tumor segmentation results. For this task, we performed experiments on one internal (Brain Tumor Segmentation (BraTS) Challenge 2021, <span><math><mrow><mi>n</mi><mo>=</mo><mn>1</mn><mo>,</mo><mn>251</mn></mrow></math></span>) and two external datasets (BraTS Challenge 2023 in Sub-Saharan Africa Patient Population (BraTS-SSA), <span><math><mrow><mi>n</mi><mo>=</mo><mn>40</mn></mrow></math></span>; Washington University School of Medicine (WUSM), <span><math><mrow><mi>n</mi><mo>=</mo><mn>175</mn></mrow></math></span>). Specifically, we first performed a three-fold cross-validation on the internal dataset using segmentations generated by different methods at various quality levels, followed by an evaluation on the external datasets. Second, we aimed to evaluate the segmentation quality of cardiac Magnetic Resonance Imaging (MRI) data from the Automated Cardiac Diagnosis Challenge (ACDC, <span><math><mrow><mi>n</mi><mo>=</mo><mn>100</mn></mrow></math></span>). The proposed method achieved high performance in predicting subject-level segmentation-quality metrics and accurately identifying segmentation errors on a voxel basis. 
This has the potential to be used to guide human-in-the-loop feedback to improve segmentations in clinical settings.</div></div>\",\"PeriodicalId\":18328,\"journal\":{\"name\":\"Medical image analysis\",\"volume\":\"107 \",\"pages\":\"Article 103718\"},\"PeriodicalIF\":11.8000,\"publicationDate\":\"2025-08-27\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Medical image analysis\",\"FirstCategoryId\":\"5\",\"ListUrlMain\":\"https://www.sciencedirect.com/science/article/pii/S1361841525002658\",\"RegionNum\":1,\"RegionCategory\":\"医学\",\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"Q1\",\"JCRName\":\"COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Medical image analysis","FirstCategoryId":"5","ListUrlMain":"https://www.sciencedirect.com/science/article/pii/S1361841525002658","RegionNum":1,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE","Score":null,"Total":0}
QCResUNet: Joint subject-level and voxel-level segmentation quality prediction
Deep learning has made significant strides in automated brain tumor segmentation from magnetic resonance imaging (MRI) scans in recent years. However, the reliability of these tools is hampered by the presence of poor-quality segmentation outliers, particularly in out-of-distribution samples, making their implementation in clinical practice difficult. Therefore, quality control (QC) is needed to screen the quality of segmentation results. Although numerous automatic QC methods have been developed for segmentation quality screening, most were designed for cardiac MRI segmentation, which involves a single modality and a single tissue type. Furthermore, most prior works provided only subject-level predictions of segmentation quality and did not identify the erroneous parts of a segmentation that may require refinement. To address these limitations, we proposed a novel multi-task deep learning architecture, termed QCResUNet, which produces subject-level segmentation-quality measures as well as voxel-level segmentation error maps for each available tissue class. To validate the effectiveness of the proposed method, we assessed its performance in evaluating the quality of two distinct segmentation tasks. First, we aimed to assess the quality of brain tumor segmentation results. For this task, we performed experiments on one internal dataset (Brain Tumor Segmentation (BraTS) Challenge 2021, n=1,251) and two external datasets (BraTS Challenge 2023 in Sub-Saharan Africa Patient Population (BraTS-SSA), n=40; Washington University School of Medicine (WUSM), n=175). Specifically, we first performed a three-fold cross-validation on the internal dataset using segmentations generated by different methods at various quality levels, followed by an evaluation on the external datasets. Second, we aimed to evaluate the segmentation quality of cardiac MRI data from the Automated Cardiac Diagnosis Challenge (ACDC, n=100). The proposed method achieved high performance in predicting subject-level segmentation-quality metrics and accurately identifying segmentation errors on a voxel basis. This has the potential to guide human-in-the-loop feedback to improve segmentations in clinical settings.
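To make the multi-task idea in the abstract concrete, below is a minimal PyTorch-style sketch: a shared convolutional encoder takes the MRI scans concatenated with a candidate segmentation and feeds both a subject-level quality-regression head and a voxel-level error-map decoder. The class name MultiTaskQCNet, the channel counts, the layer depths, and the use of a sigmoid-bounded Dice-like score are illustrative assumptions and do not reproduce the published QCResUNet architecture.

```python
# Sketch of a multi-task segmentation-QC network (assumed, simplified):
# a shared 3D encoder feeds (i) a regression head predicting one
# subject-level quality score per tissue class and (ii) a decoder
# predicting a voxel-level segmentation-error map per tissue class.
import torch
import torch.nn as nn


def conv_block(in_ch: int, out_ch: int) -> nn.Sequential:
    """Two 3D convolutions with instance normalization and ReLU."""
    return nn.Sequential(
        nn.Conv3d(in_ch, out_ch, kernel_size=3, padding=1),
        nn.InstanceNorm3d(out_ch),
        nn.ReLU(inplace=True),
        nn.Conv3d(out_ch, out_ch, kernel_size=3, padding=1),
        nn.InstanceNorm3d(out_ch),
        nn.ReLU(inplace=True),
    )


class MultiTaskQCNet(nn.Module):
    def __init__(self, in_channels: int = 4 + 3, num_classes: int = 3):
        # Input: MRI modalities concatenated with the candidate segmentation
        # (e.g., 4 MRI channels + 3 one-hot tumor sub-region channels).
        super().__init__()
        self.enc1 = conv_block(in_channels, 16)
        self.enc2 = conv_block(16, 32)
        self.enc3 = conv_block(32, 64)
        self.pool = nn.MaxPool3d(2)

        # Subject-level head: pooled bottleneck features -> one quality
        # score per tissue class, squashed to [0, 1] like a Dice score.
        self.quality_head = nn.Sequential(
            nn.AdaptiveAvgPool3d(1), nn.Flatten(),
            nn.Linear(64, num_classes), nn.Sigmoid(),
        )

        # Voxel-level head: decoder with skip connections -> per-class
        # segmentation-error probability map at full resolution.
        self.up2 = nn.ConvTranspose3d(64, 32, kernel_size=2, stride=2)
        self.dec2 = conv_block(64, 32)
        self.up1 = nn.ConvTranspose3d(32, 16, kernel_size=2, stride=2)
        self.dec1 = conv_block(32, 16)
        self.error_head = nn.Conv3d(16, num_classes, kernel_size=1)

    def forward(self, x: torch.Tensor):
        e1 = self.enc1(x)
        e2 = self.enc2(self.pool(e1))
        e3 = self.enc3(self.pool(e2))

        quality = self.quality_head(e3)                 # (B, num_classes)

        d2 = self.dec2(torch.cat([self.up2(e3), e2], dim=1))
        d1 = self.dec1(torch.cat([self.up1(d2), e1], dim=1))
        error_map = torch.sigmoid(self.error_head(d1))  # (B, C, D, H, W)
        return quality, error_map


if __name__ == "__main__":
    # Toy forward pass: 4 MRI channels + 3 segmentation channels, 32^3 patch.
    net = MultiTaskQCNet()
    x = torch.randn(1, 7, 32, 32, 32)
    q, err = net(x)
    print(q.shape, err.shape)  # torch.Size([1, 3]) torch.Size([1, 3, 32, 32, 32])
```

In a setup like this, the quality head would typically be trained against quality metrics computed from reference segmentations, and the error-map head against voxel-wise disagreement between the candidate and reference masks; the exact targets and losses used by the authors are described in the paper itself.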
Journal introduction:
Medical Image Analysis serves as a platform for sharing new research findings in the realm of medical and biological image analysis, with a focus on applications of computer vision, virtual reality, and robotics to biomedical imaging challenges. The journal prioritizes the publication of high-quality, original papers contributing to the fundamental science of processing, analyzing, and utilizing medical and biological images. It welcomes approaches utilizing biomedical image datasets across all spatial scales, from molecular/cellular imaging to tissue/organ imaging.