Md Mostafa Kamal Sarker, Divyanshu Mishra, Mohammad Alsharid, Netzahualcoyotl Hernandez-Cruz, Rahul Ahuja, Olga Patey, Aris T. Papageorghiou, J. Alison Noble

Medical Image Analysis, Volume 107, Article 103758. Published 2025-08-26. DOI: 10.1016/j.media.2025.103758
HarmonicEchoNet: Leveraging harmonic convolutions for automated standard plane detection in fetal heart ultrasound videos
Fetal echocardiography offers non-invasive, real-time imaging of the fetal heart to identify congenital heart conditions. Manual acquisition of standard heart views is time-consuming, whereas automated detection remains challenging due to high spatial similarity across anatomical views with only subtle local variations in image appearance. To address these challenges, we introduce a lightweight, frequency-guided deep learning model named HarmonicEchoNet that automatically detects standard heart views in a transverse-sweep or freehand ultrasound scan of the fetal heart.
HarmonicEchoNet uses harmonic convolution blocks (HCBs) and a harmonic spatial and channel squeeze-and-excitation (hscSE) module. The HCBs apply a Discrete Cosine Transform (DCT)-based harmonic decomposition to input features, which are then combined using learned weights. The hscSE module identifies significant regions in the spatial domain to improve feature extraction of the fetal heart anatomical structures, capturing both spatial and channel-wise dependencies in an ultrasound image. The combination of these modules improves model performance relative to recent CNN-based, transformer-based, and CNN+transformer-based image classification models.
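The abstract does not include implementation details, but the two core ideas can be illustrated with a minimal pure-Python sketch. Function names here are illustrative, and the channel gate below uses a single learned scalar per channel as a simplification; the actual hscSE module also models spatial dependencies and uses learned fully connected layers.

```python
import math

def dct_basis(k):
    """Build the k*k two-dimensional DCT-II basis filters, each of size k x k.

    Harmonic convolution blocks replace freely learned spatial filters with
    this fixed transform basis; only the mixing weights are learned.
    """
    def c(u, x):
        return math.cos(math.pi * (2 * x + 1) * u / (2 * k))
    return [[[c(u, x) * c(v, y) for y in range(k)] for x in range(k)]
            for u in range(k) for v in range(k)]

def harmonic_response(patch, weights, k=3):
    """Harmonic convolution at one spatial position: project the k x k patch
    onto every DCT filter, then mix the k*k responses with the learned
    weights (one scalar weight per harmonic)."""
    responses = [sum(patch[x][y] * f[x][y] for x in range(k) for y in range(k))
                 for f in dct_basis(k)]
    return sum(w * r for w, r in zip(responses, weights))

def channel_se(feature_maps, gate_weights):
    """Toy channel squeeze-and-excitation: average-pool each channel
    (squeeze), pass the pooled value through a sigmoid gate (excite),
    and rescale that channel by the resulting attention score."""
    sigmoid = lambda t: 1.0 / (1.0 + math.exp(-t))
    out = []
    for ch, w in zip(feature_maps, gate_weights):
        z = sum(sum(row) for row in ch) / (len(ch) * len(ch[0]))  # squeeze
        s = sigmoid(w * z)                                        # excite
        out.append([[v * s for v in row] for row in ch])          # rescale
    return out
```

For example, projecting a constant 3 x 3 patch onto the basis puts all its energy in the DC (u=0, v=0) harmonic, so a weight vector selecting only that harmonic recovers the patch sum, while any higher harmonic responds with zero.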
We use four datasets from two private studies, PULSE (Perception Ultrasound by Learning Sonographic Experience) and CAIFE (Clinical Artificial Intelligence in Fetal Echocardiography), to develop and evaluate HarmonicEchoNet models. Experimental results show that HarmonicEchoNet is 10–15 times faster than ConvNeXt, DeiT, and VOLO, with an inference time of just 3.9 ms. It also achieves 2%–7% accuracy improvement in classifying fetal heart standard planes compared to these baselines. Furthermore, with just 19.9 million parameters compared to ConvNeXt’s 196.24 million, HarmonicEchoNet is nearly ten times more parameter-efficient.
About the journal:
Medical Image Analysis serves as a platform for sharing new research findings in the realm of medical and biological image analysis, with a focus on applications of computer vision, virtual reality, and robotics to biomedical imaging challenges. The journal prioritizes the publication of high-quality, original papers contributing to the fundamental science of processing, analyzing, and utilizing medical and biological images. It welcomes approaches utilizing biomedical image datasets across all spatial scales, from molecular/cellular imaging to tissue/organ imaging.