Interactively Fusing Global and Local Features for Benign and Malignant Classification of Breast Ultrasound Images

IF 2.4, Zone 3 (Medicine), JCR Q2 (Acoustics)

Ultrasound in Medicine and Biology Pub Date : 2024-12-20 DOI:10.1016/j.ultrasmedbio.2024.11.014

Wenhan Wang, Jiale Zhou, Jin Zhao, Xun Lin, Yan Zhang, Shan Lu, Wanchen Zhao, Shuai Wang, Wenzhong Tang, Xiaolei Qu

Volume 51, Issue 3, Pages 525-534. Journal article, not open access. Source: https://www.sciencedirect.com/science/article/pii/S0301562924004381

Citations: 0

Abstract

Objective

Breast ultrasound (BUS) is used to classify benign and malignant breast tumors, and its automatic classification can reduce subjectivity. However, current convolutional neural networks (CNNs) face challenges in capturing global features, while vision transformer (ViT) networks have limitations in effectively extracting local features. Therefore, this study aimed to develop a deep learning method that enables the interaction and updating of intermediate features between CNN and ViT to achieve high-accuracy BUS image classification.

Methods

This study introduced the CNN and transformer multi-stage fusion network (CTMF-Net), which consists of two branches: a CNN branch and a transformer branch. The CNN branch uses a Visual Geometry Group (VGG) network as its backbone, while the transformer branch uses ViT as its base network. Both branches are divided into four stages. At the end of each stage, a proposed feature interaction module exchanges and fuses features between the two branches. In addition, a convolutional block attention module (CBAM) enhances relevant features after each stage of the CNN branch. CTMF-Net was compared experimentally with various state-of-the-art deep-learning classification methods on three public breast ultrasound datasets (SYSU, UDIAT and BUSI).
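The abstract does not specify how the feature interaction module is built, so the following is only a minimal NumPy sketch of one plausible exchange step between a CNN feature map and ViT patch tokens. Everything in it is an assumption for illustration: the function names (`map_to_tokens`, `tokens_to_map`, `interact`), the projection matrices `w_c2t` and `w_t2c`, the use of average pooling and nearest-neighbour upsampling, the additive fusion, and the absence of a class token. It is not the authors' implementation.

```python
import numpy as np

def map_to_tokens(fmap, grid):
    """Average-pool a CNN feature map (C, H, W) into a (grid*grid, C) token sequence."""
    c, h, w = fmap.shape
    ph, pw = h // grid, w // grid
    # Split H and W into grid x grid cells and average inside each cell.
    cells = fmap[:, :ph * grid, :pw * grid].reshape(c, grid, ph, grid, pw)
    pooled = cells.mean(axis=(2, 4))            # (C, grid, grid)
    return pooled.reshape(c, grid * grid).T     # (grid*grid, C), row-major order

def tokens_to_map(tokens, grid, size):
    """Reshape (grid*grid, C) tokens to (C, grid, grid), then upsample to (C, size, size)."""
    c = tokens.shape[1]
    fmap = tokens.T.reshape(c, grid, grid)
    rep = size // grid                          # assumes size is a multiple of grid
    return fmap.repeat(rep, axis=1).repeat(rep, axis=2)

def interact(cnn_map, vit_tokens, w_c2t, w_t2c):
    """Hypothetical interaction step: each branch adds a projected copy of the other's features.

    cnn_map:    (C, S, S) CNN-branch feature map at the end of a stage
    vit_tokens: (N, D) transformer patch tokens, N a perfect square (no class token)
    w_c2t:      (C, D) projection from CNN channels to token width (assumed)
    w_t2c:      (D, C) projection from token width to CNN channels (assumed)
    """
    grid = int(round(np.sqrt(vit_tokens.shape[0])))
    size = cnn_map.shape[1]
    new_tokens = vit_tokens + map_to_tokens(cnn_map, grid) @ w_c2t
    new_map = cnn_map + tokens_to_map(vit_tokens @ w_t2c, grid, size)
    return new_map, new_tokens
```

In a four-stage network of the kind described, a step like `interact` would run once at the end of each stage, with the updated map and tokens passed on to the next stage of their respective branches.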

Results

In internal validation, CTMF-Net achieved the highest accuracy on both datasets, 90.14 ± 0.58% on SYSU and 92.04 ± 4.90% on UDIAT, significantly outperforming the other state-of-the-art networks (p < 0.05). In external validation on BUSI, CTMF-Net achieved the highest area under the curve (AUC) of 0.8704 when trained on SYSU, an improvement of 0.0126 over the second-best method, Visual Geometry Group attention ViT. Similarly, when trained on UDIAT, CTMF-Net achieved an AUC of 0.8505, surpassing the second-best method, global context ViT, by 0.0130.
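For readers unfamiliar with the two reported metrics, the sketch below shows how an AUC and a "mean ± standard deviation" accuracy are typically computed. It uses only the Python standard library. The function names, the label convention (1 = malignant, 0 = benign), and the use of the sample standard deviation are assumptions; the abstract does not say how the authors aggregated runs or which deviation they reported.

```python
from statistics import mean, stdev

def auc_score(labels, scores):
    """AUC as the probability that a random malignant case (label 1) receives a
    higher score than a random benign case (label 0); ties count as one half."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def summarize(accuracies):
    """Return (mean, sample standard deviation) over repeated runs or folds."""
    return mean(accuracies), stdev(accuracies)
```

For example, `auc_score([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8])` compares four malignant/benign pairs, three of which are ranked correctly, giving 0.75.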

Conclusion

The proposed method, CTMF-Net, outperformed the existing methods it was compared with and can effectively assist doctors in classifying breast tumors more accurately.


Source journal

Ultrasound in Medicine and Biology (category: Medicine, Nuclear Medicine)

CiteScore: 6.20

Self-citation rate: 6.90%

Articles published: 325

Review time: 70 days

About the journal: Ultrasound in Medicine and Biology is the official journal of the World Federation for Ultrasound in Medicine and Biology. The journal publishes original contributions that demonstrate a novel application of an existing ultrasound technology in clinical diagnostic, interventional and therapeutic applications; new and improved clinical techniques; the physics, engineering and technology of ultrasound in medicine and biology; and the interactions between ultrasound and biological systems, including bioeffects. Papers that simply use standard diagnostic ultrasound as a measuring tool are considered out of scope. The journal also publishes extended critical reviews of subjects of contemporary interest in the field, as well as occasional editorial articles, clinical and technical notes, book reviews, letters to the editor and a calendar of forthcoming meetings. The journal aims to fully meet the information and publication requirements of the clinicians, scientists, engineers and other professionals who constitute the biomedical ultrasonic community.