Performance evaluation of deep learning algorithms in MRI breast lesion segmentation and detection

IF 4.9 · Zone 2, Medicine · Q1 ENGINEERING, BIOMEDICAL

Biomedical Signal Processing and Control Pub Date : 2025-10-10 DOI:10.1016/j.bspc.2025.108853

Jupeng Zhang, Qi Wu, Jinhua Hu, Xiqi Zhu, Baosheng Li

Biomedical Signal Processing and Control, volume 112, Article 108853. Publication type: Journal Article. Not open access. Full text: https://www.sciencedirect.com/science/article/pii/S1746809425013643

Citations: 0

Abstract

Purpose

This study systematically evaluates the efficacy of deep learning (DL) algorithms for segmenting and detecting breast lesions in magnetic resonance imaging (MRI), focusing on segmentation accuracy and clinical applicability.

Methods

Following PRISMA-DTA guidelines, we searched PubMed, Embase, Scopus, and Web of Science, identifying 19 eligible studies. Eligible studies applied DL to breast lesion segmentation and detection on MRI and reported comprehensive data on segmentation efficacy. Study quality was assessed using QUADAS-AI. Meta-analysis was performed using random-effects modeling, with segmentation accuracy quantified by the Dice similarity coefficient (DSC) and lesion detection efficacy by sensitivity. Heterogeneity was explored through meta-regression and subgroup analysis.
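As an illustration of the two metrics named above, the following minimal sketch computes a DSC from a pair of binary masks and a sensitivity from detection counts. The function names, the toy masks, and the convention that two empty masks score 1.0 are assumptions made for this example; they are not taken from the reviewed studies.

```python
import numpy as np


def dice_coefficient(pred, truth):
    """Dice similarity coefficient: 2|A ∩ B| / (|A| + |B|) for binary masks."""
    pred = np.asarray(pred, dtype=bool)
    truth = np.asarray(truth, dtype=bool)
    total = pred.sum() + truth.sum()
    if total == 0:
        # Assumed convention: two empty masks are treated as perfect agreement.
        return 1.0
    return 2.0 * np.logical_and(pred, truth).sum() / total


def sensitivity(true_positives, false_negatives):
    """Fraction of true lesions that were detected: TP / (TP + FN)."""
    return true_positives / (true_positives + false_negatives)


# Toy 3x3 masks (hypothetical data): each mask has 3 lesion pixels, 2 of them shared.
truth = np.array([[0, 1, 1],
                  [0, 1, 0],
                  [0, 0, 0]])
pred = np.array([[0, 1, 0],
                 [0, 1, 1],
                 [0, 0, 0]])

dsc = dice_coefficient(pred, truth)   # 2 * 2 / (3 + 3)
sens = sensitivity(9, 1)              # 9 detected lesions, 1 missed
print(f"DSC = {dsc:.3f}, sensitivity = {sens:.2f}")
```

A DSC of 1 means the predicted and reference masks coincide exactly, and 0 means they do not overlap at all.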

Results

The 19 studies evaluated DL algorithms such as U-Net, nnU-Net, and CNN. DSC for segmentation ranged from 0.61 to 0.97, with a pooled DSC of 0.82 (95% CI: 0.76–0.88). Pooled sensitivity across six studies was 0.86 (95% CI: 0.75–0.98). Subgroup analyses showed higher accuracy in multicenter studies (0.86 vs. 0.80), studies with external validation (0.89 vs. 0.79), and studies using 3.0 T MRI devices (0.88 vs. 0.83). Studies that applied intensity normalization also showed higher accuracy (0.87 vs. 0.79). nnU-Net achieved the highest DSC (0.97). Significant heterogeneity (I² = 99.6%) and publication bias (p = 0.018) were observed.
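To make the pooled estimate and the I² statistic above more concrete, here is a minimal sketch of random-effects pooling. It assumes the DerSimonian-Laird estimator on untransformed effect sizes, which the abstract does not specify; the per-study values and variances are hypothetical and are not the data of the 19 included studies.

```python
import numpy as np


def random_effects_pool(effects, variances):
    """DerSimonian-Laird random-effects pooling (assumed estimator).

    Returns the pooled effect, its 95% confidence interval,
    the between-study variance tau^2, and I^2 in percent.
    """
    y = np.asarray(effects, dtype=float)
    v = np.asarray(variances, dtype=float)
    df = len(y) - 1

    # Fixed-effect (inverse-variance) estimate, needed for Cochran's Q.
    w = 1.0 / v
    fixed = np.sum(w * y) / np.sum(w)
    q = np.sum(w * (y - fixed) ** 2)

    # Between-study variance, truncated at zero.
    c = np.sum(w) - np.sum(w ** 2) / np.sum(w)
    tau2 = max(0.0, (q - df) / c)

    # Random-effects weights add tau^2 to each within-study variance.
    w_re = 1.0 / (v + tau2)
    pooled = np.sum(w_re * y) / np.sum(w_re)
    se = np.sqrt(1.0 / np.sum(w_re))
    ci = (pooled - 1.96 * se, pooled + 1.96 * se)

    # I^2: share of total variation attributed to between-study heterogeneity.
    i2 = max(0.0, (q - df) / q) * 100.0 if q > 0 else 0.0
    return pooled, ci, tau2, i2


# Hypothetical per-study DSC values and variances, for illustration only.
study_dsc = [0.61, 0.80, 0.85, 0.97]
study_var = [0.0004, 0.0009, 0.0004, 0.0001]

pooled, ci, tau2, i2 = random_effects_pool(study_dsc, study_var)
print(f"pooled = {pooled:.3f}, 95% CI = ({ci[0]:.3f}, {ci[1]:.3f}), I^2 = {i2:.1f}%")
```

An I² close to 100%, as reported above, indicates that almost all of the observed variation between studies reflects real differences rather than sampling error, which is why the subgroup comparisons matter for interpreting the pooled figure. In practice, proportions such as DSC and sensitivity are often pooled on a transformed scale (for example logit); this sketch omits that step for brevity.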

Conclusion

DL algorithms show high accuracy in breast lesion segmentation and detection, particularly in multicenter studies and those with external validation. Future research should optimize algorithms to reduce heterogeneity and validate clinical applicability.




Source journal

Biomedical Signal Processing and Control (Engineering and Technology: Engineering, Biomedical)

CiteScore: 9.80

Self-citation rate: 13.70%

Articles published: 822

Review time: 4 months

Journal description: Biomedical Signal Processing and Control aims to provide a cross-disciplinary international forum for the interchange of information on research in the measurement and analysis of signals and images in clinical medicine and the biological sciences. Emphasis is placed on contributions dealing with practical, applications-led research on the use of methods and devices in clinical diagnosis, patient monitoring and management. Biomedical Signal Processing and Control reflects the main areas in which these methods are being used and developed at the interface of engineering and clinical science. The scope of the journal is defined to include relevant review papers, technical notes, short communications and letters. Tutorial papers and special issues will also be published.