Md Mostafa Kamal Sarker, Divyanshu Mishra, Mohammad Alsharid, Netzahualcoyotl Hernandez-Cruz, Rahul Ahuja, Olga Patey, Aris T. Papageorghiou, J. Alison Noble

Medical Image Analysis, Volume 107, Article 103758. Published 2025-08-26. DOI: 10.1016/j.media.2025.103758
HarmonicEchoNet: Leveraging harmonic convolutions for automated standard plane detection in fetal heart ultrasound videos
Fetal echocardiography offers non-invasive, real-time imaging of the fetal heart to identify congenital heart conditions. Manual acquisition of standard heart views is time-consuming, whereas automated detection remains challenging due to high spatial similarity across anatomical views with only subtle local variations in image appearance. To address these challenges, we introduce a lightweight, frequency-guided deep learning model named HarmonicEchoNet that automatically detects standard heart views in a transverse-sweep or freehand ultrasound scan of the fetal heart.
HarmonicEchoNet uses harmonic convolution blocks (HCBs) and a harmonic spatial and channel squeeze-and-excitation (hscSE) module. The HCBs apply a Discrete Cosine Transform (DCT)-based harmonic decomposition to input features, which are then combined using learned weights. The hscSE module identifies significant regions in the spatial domain to improve feature extraction of the fetal heart anatomical structures, capturing both spatial and channel-wise dependencies in an ultrasound image. The combination of these modules improves model performance relative to recent CNN-based, transformer-based, and CNN+transformer-based image classification models.
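The abstract does not include implementation details, but the two core ideas can be illustrated with a minimal pure-Python sketch. Function names here are illustrative, and the channel gate below uses a single learned scalar per channel as a simplification; the actual hscSE module also models spatial dependencies and uses learned fully connected layers.

```python
import math

def dct_basis(k):
    """Build the k*k two-dimensional DCT-II basis filters, each of size k x k.

    Harmonic convolution blocks replace freely learned spatial filters with
    this fixed transform basis; only the mixing weights are learned.
    """
    def c(u, x):
        return math.cos(math.pi * (2 * x + 1) * u / (2 * k))
    return [[[c(u, x) * c(v, y) for y in range(k)] for x in range(k)]
            for u in range(k) for v in range(k)]

def harmonic_response(patch, weights, k=3):
    """Harmonic convolution at one spatial position: project the k x k patch
    onto every DCT filter, then mix the k*k responses with the learned
    weights (one scalar weight per harmonic)."""
    responses = [sum(patch[x][y] * f[x][y] for x in range(k) for y in range(k))
                 for f in dct_basis(k)]
    return sum(w * r for w, r in zip(responses, weights))

def channel_se(feature_maps, gate_weights):
    """Toy channel squeeze-and-excitation: average-pool each channel
    (squeeze), pass the pooled value through a sigmoid gate (excite),
    and rescale that channel by the resulting attention score."""
    sigmoid = lambda t: 1.0 / (1.0 + math.exp(-t))
    out = []
    for ch, w in zip(feature_maps, gate_weights):
        z = sum(sum(row) for row in ch) / (len(ch) * len(ch[0]))  # squeeze
        s = sigmoid(w * z)                                        # excite
        out.append([[v * s for v in row] for row in ch])          # rescale
    return out
```

For example, projecting a constant 3 x 3 patch onto the basis puts all its energy in the DC (u=0, v=0) harmonic, so a weight vector selecting only that harmonic recovers the patch sum, while any higher harmonic responds with zero.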
We use four datasets from two private studies, PULSE (Perception Ultrasound by Learning Sonographic Experience) and CAIFE (Clinical Artificial Intelligence in Fetal Echocardiography), to develop and evaluate HarmonicEchoNet models. Experimental results show that HarmonicEchoNet is 10–15 times faster than ConvNeXt, DeiT, and VOLO, with an inference time of just 3.9 ms. It also achieves 2%–7% accuracy improvement in classifying fetal heart standard planes compared to these baselines. Furthermore, with just 19.9 million parameters compared to ConvNeXt’s 196.24 million, HarmonicEchoNet is nearly ten times more parameter-efficient.
About the journal:
Medical Image Analysis serves as a platform for sharing new research findings in the realm of medical and biological image analysis, with a focus on applications of computer vision, virtual reality, and robotics to biomedical imaging challenges. The journal prioritizes the publication of high-quality, original papers contributing to the fundamental science of processing, analyzing, and utilizing medical and biological images. It welcomes approaches utilizing biomedical image datasets across all spatial scales, from molecular/cellular imaging to tissue/organ imaging.