{"title":"Quality assessment of sports actions based on adaptive-UniFormer","authors":"Suxia Xing, Zheng Guo, Chongchong Yu, Kexian Li, Shihang Zhao","doi":"10.1016/j.dsp.2025.105549","DOIUrl":"10.1016/j.dsp.2025.105549","url":null,"abstract":"<div><div>Sports action quality assessment (AQA) presents significant challenging, requiring comprehensive evaluation of motion completeness, fluency, and difficulty level for accurate quality scoring. This paper proposes <strong>Adaptive-UniFormer,</strong> an innovative AQA network integrating an <strong>Adaptive Token Halting Mechanism (ATHM)</strong> based on the UniFormerV2 architecture. The framework introduces the <strong><em>Top-K</em> selection mechanism</strong> in local feature extraction to efficiently eliminate redundant background tokens, and ATHM in the global feature extraction to focus computation on action-related tokens, significantly reducing computational overhead. Final action classification and quality scores are generated through multi-stage feature fusion and a Multi-Layer Perceptron (MLP). Comprehensive experiments demonstrate superior performance, for action recognition, the model achieves 87.6 % Top-1 and 98.7 % Top-5 accuracy on the UCF101, while reducing computational costs by 46.5 % in FLOPs, along with 78.4 % Top-1 accuracy On HMDB51. For action quality assessment, it obtains average Spearman’s rank correlation coefficient of 0.8223 on AQA-7 and 0.9502 on MTL-AQA. In conclusion, the proposed <strong>Adaptive-UniFormer</strong> establishes new benchmarks for <strong>recognition accuracy, computational efficiency</strong>, and <strong>AQA performance</strong>, offering an effective solution for sports action analysis.</div></div>","PeriodicalId":51011,"journal":{"name":"Digital Signal Processing","volume":"168 ","pages":"Article 105549"},"PeriodicalIF":3.0,"publicationDate":"2025-08-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144892062","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"工程技术","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Optimized spatial automatic color enhancement technique: A novel approach for color restoration in retinopathy of prematurity (ROP) retinal images","authors":"Akhilesh Kakade , Rajesh Kumar Dhanaraj","doi":"10.1016/j.dsp.2025.105548","DOIUrl":"10.1016/j.dsp.2025.105548","url":null,"abstract":"<div><div>Infant retinal images are crucial for clinicians in diagnosing pediatric retinal diseases such as Retinopathy of Prematurity (ROP). These images are highly inclined towards distortion due to various factors such as errors in imaging instruments, transmission channels, variable atmospheric and environmental conditions leading to degradation of image quality. Such distortions can manifest in different ways such as noise, backscattering, low saturation, poor contrast, low illumination, and blurring, compromising the effectiveness of retinal images, potentially limits the accurate ROP diagnosis and treatment. To address these challenges, we present an Optimized Spatial Automatic Color Enhancement (OS-ACE) technique which employs a locally adaptive enhancement algorithm that operates on individual color channels of the RGB retina image. By utilizing the convolutional based enhancement coupled with a power-law transformation, the technique selectively amplifies local contrast while preserving overall image balance. This local enhancement is further complemented by a normalization procedure to ensure the retention of true color perception and mitigate the introduction of artifacts. The study integrates the OS-ACE algorithm into the established framework of conformal mapping to develop a comprehensive processing pipeline which significantly enhances the quality of retinal images ensuring optimal visualization and better clinical decision for retinal disease treatment. The performance of proposed framework was evaluated quantitatively and qualitatively on 1205 ROP retinal image datasets. The method achieved accuracy 0.9846, F1 score 0.8362, Jaccard score 0.7186, recall 0.8091, and precision 0.8652 respectively. The image quality assessment (IQA) models achieved results of PSNR 28.8371, SSIM 0.8705, FSIM 0.7024, BRISQUE 35.8471 respectively. Compared to existing methods, such as Alimanov, A. et al.’s super resolution technique using deep learning, Shen Z., et al.’s COFE-Net, Bataineh, B. et al.’s multi-stage enhancement method, the proposed OS-ACE technique achieves better results in PSNR, SSIM, F1 score, recall, precision and accuracy highlighting the robustness and effectiveness in clinical applications.</div></div>","PeriodicalId":51011,"journal":{"name":"Digital Signal Processing","volume":"168 ","pages":"Article 105548"},"PeriodicalIF":3.0,"publicationDate":"2025-08-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144886773","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"工程技术","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Ultra-short-term wind power forecasting based on the MIFCformer model and a critical low wind speed region power revision strategy","authors":"Jiuyuan Huo , Wenyuan Bian , Chen Chang","doi":"10.1016/j.dsp.2025.105521","DOIUrl":"10.1016/j.dsp.2025.105521","url":null,"abstract":"<div><div>The inherent intermittency, randomness, and volatility of wind power generation pose significant challenges to the stable operation of large-scale power grids. Accurate ultra-short-term forecasting is crucial for maintaining grid safety. Most existing ultra-short-term wind power forecasting methods have limited effectiveness in modeling multi-scale temporal dependencies and often fail to address the inadequate modeling capability under critical low wind speed conditions. To address this issue, this paper proposes a method for ultra-short-term wind power prediction using Multiscale and Interactive Fusion Convolution Transformer and Critical Low Wind Speed Region Power Revision Strategy (MIFCformer-CRS), to improve power prediction accuracy. The MIFCformer model is designed for initial power prediction. Leveraging multi-scale inputs, ProbSparse self-attention from Informer model, and temporal convolutional networks, the model enhances its capacity to capture complex patterns, while interactive top-down fusion convolution ensures deep fusion of multi-scale features. The revision strategy addresses overestimation and lag in initial predictions within critical low wind speed regions, enabling precise power adjustments. The revision strategy begins with constructing a Decomposition and Mixing of Factors (DMF) model for wind speed prediction. The wind speed sequence is decomposed into subsequences, processed individually using Multilayer Perceptron (MLP), and reconstructed to ensure prediction accuracy while effectively incorporating the influence of external factors. Next, predicted wind speed and preliminary power outputs are input into the multi-factor dynamic revision function to adjust the preliminary power prediction results. The proposed model was validated using wind farm data from Yunnan Province, China. Our method significantly outperformed 12 benchmark models, including LSTM, Informer, FEDformer, DLinear, and the hybrid method WSTD-Autoformer. Compared with other models, the proposed approach achieved reductions in Mean Absolute Error (MAE) and Root Mean Square Error (RMSE) ranging from 7.08% to 34.27%, and attained improvements in the highest coefficient of determination (R<sup>2</sup>) by 2.48% to 17.91%. These results demonstrate the superior prediction performance of the proposed model.</div></div>","PeriodicalId":51011,"journal":{"name":"Digital Signal Processing","volume":"168 ","pages":"Article 105521"},"PeriodicalIF":3.0,"publicationDate":"2025-08-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144879380","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"工程技术","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Visible-infrared person re-identification via adaptive frequency mining and embedding","authors":"Wei Sun , Yaqi Wang , Xinbo Gao , Yibao Zhao , Yongchao Song , Zhiqiang Hou , Yanning Zhang","doi":"10.1016/j.dsp.2025.105526","DOIUrl":"10.1016/j.dsp.2025.105526","url":null,"abstract":"<div><div>Visible-infrared person re-identification (VI-ReID) is a challenging task in computer vision that aims to match individuals across images captured in visible and infrared modalities. Existing approaches typically focus on either image-level or feature-level alignment, yet often struggle to effectively bridge the modality gap. In this paper, we propose a novel frequency-aware representation learning framework that leverages the complementary properties of visible and infrared images in the frequency domain to generate diverse and informative embeddings, thereby reducing cross-modal discrepancies. Specifically, we first extract low- and high-frequency features from input representations, guided by adaptively decoupled spectral components. These features are then refined via a bidirectional modulation operator that promotes interaction between frequency components. Furthermore, we design a multistage knowledge fusion module to enhance the complementarity between global structures and fine-grained details across multiple frequency scales. Extensive experiments on public benchmark datasets demonstrate that our method significantly outperforms state-of-the-art approaches, validating its effectiveness and generalization capability in complex cross-modal scenarios.</div></div>","PeriodicalId":51011,"journal":{"name":"Digital Signal Processing","volume":"168 ","pages":"Article 105526"},"PeriodicalIF":3.0,"publicationDate":"2025-08-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144860403","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"工程技术","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Edge semantic collaboration network for salient object detection in optical remote sensing images","authors":"Yanzhao Wang , Yanping Yao , Tongchi Zhou , Zhongyun Liu , Li Yan , Long Zhu","doi":"10.1016/j.dsp.2025.105536","DOIUrl":"10.1016/j.dsp.2025.105536","url":null,"abstract":"<div><div>The rapid development of deep learning has promoted the development of salient object detection in optical remote sensing images (ORSI-SOD). However, ORSI-SOD faces many challenges, including the interference of color and shadow backgrounds, or the uncertainty of the number and scale of objects in optical remote sensing images (ORSIs). Most of the existing models have difficulty in establishing effective long-distance feature dependencies. To address this issue, we propose an edge semantic collaboration network (ESCNet). Specifically, ESCNet designs an Interactive Graph Inference Module (IGIM) to model channel interactions and capture long-distance semantic dependencies via graph inference. Then, a Semantic Feature Enhancement Module (SFEM) is adopted to refine the dependency information based on a composite attention mechanism. Simultaneously, a Multi-scale Edge Refinement Module (MERM) extracts precise boundaries using multi-scale feature refinement. Finally, the features produced at each stage are sequentially fed into the decoder and generate the final saliency maps. Extensive experiments on three public datasets (ORSSD, EORSSD, and ORSI-4199) confirm the superiority of the proposed ESCNet compared with state-of-the-art methods.</div></div>","PeriodicalId":51011,"journal":{"name":"Digital Signal Processing","volume":"168 ","pages":"Article 105536"},"PeriodicalIF":3.0,"publicationDate":"2025-08-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144840915","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"工程技术","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Multi-order information aggregation network for remote sensing small target detection","authors":"Jinli Zhong, Jianxun Zhang","doi":"10.1016/j.dsp.2025.105537","DOIUrl":"10.1016/j.dsp.2025.105537","url":null,"abstract":"<div><div>In critical fields such as intelligent transportation and national defense security, remote sensing small target detection technology plays an essential role. To effectively overcome the complexity of remote sensing scenes and the weak response of small-scale targets, this paper proposes a lightweight Multi-order Information Aggregation Network (MIANet). MIANet mainly consists of two parts: Cross-spatial Multi-order Information Aggregation Module (CMIAM) and Multi-dimensional Information Enhancement Module (MIEM). Inspired by the research on multi-order interactions in game theory within deep learning, CMIAM can aggregate low-order, mid-order, and high-order information, effectively improving the detection accuracy of small targets in complex remote sensing scenes. Based on the design philosophy of manifolds of interest, MIEM can effectively remove redundant information, and MIEM utilizes a three-branch structure to capture cross-dimensional information interaction, enriching feature representation and achieving the effect of information enhancement. We have validated the performance of our model on multiple remote sensing small target datasets including VEDAI, DIOR, NWPU-VHR10, MVRSD, and SIMD, and achieved excellent results. In particular, for the lightweight MIANet, the accuracy metric <span><math><mtext>m</mtext><mi>A</mi><msub><mrow><mi>P</mi></mrow><mrow><mn>50</mn></mrow></msub></math></span> reached 73.7% on the VEDAI dataset, surpassing the current SOTA method SuperYOLO for remote sensing small target detection on a single modality. On the NWPU-VHR10 dataset, MIANet outperformed SuperYOLO by 2.1% in the <span><math><mtext>m</mtext><mi>A</mi><msub><mrow><mi>P</mi></mrow><mrow><mn>50</mn></mrow></msub></math></span> metric and FFCA-YOLO by 2.2%. On the DIOR dataset, with a parameter count of 9.79M, MIANet achieved an <span><math><mtext>m</mtext><mi>A</mi><msub><mrow><mi>P</mi></mrow><mrow><mn>50</mn></mrow></msub></math></span> metric of 81.3% and an <span><math><mtext>m</mtext><mi>A</mi><msub><mrow><mi>P</mi></mrow><mrow><mn>50</mn><mo>:</mo><mn>95</mn></mrow></msub></math></span> of 60.9%, which demonstrate that our model exhibits strong robustness characteristics. Our code will be made publicly available on <span><span>https://github.com/Liro-o/MIANet</span><svg><path></path></svg></span>.</div></div>","PeriodicalId":51011,"journal":{"name":"Digital Signal Processing","volume":"168 ","pages":"Article 105537"},"PeriodicalIF":3.0,"publicationDate":"2025-08-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144867330","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"工程技术","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"MSRA-Net: A multi-scale and region-aware network for breast cancer ultrasound image segmentation","authors":"Yingxuan Guo , Yan Qiang , Qi Chen , Qing Li , Jijie Sun","doi":"10.1016/j.dsp.2025.105534","DOIUrl":"10.1016/j.dsp.2025.105534","url":null,"abstract":"<div><div>Automated analysis of breast ultrasound images holds significant potential to improve the accuracy of early breast cancer diagnosis, enabling physicians to rapidly and precisely identify lesion areas and providing timely, scientifically grounded decision support for clinical treatment. However, the inherent challenges of breast ultrasound images—such as speckle noise, blurred lesion boundaries, and heterogeneous gray-scale distributions—make accurate lesion extraction difficult for traditional segmentation methods. Although deep convolutional neural networks (CNNs) have achieved remarkable progress in medical image segmentation, their limited local receptive fields often result in insufficient modeling of long-range spatial dependencies, hindering their ability to effectively handle the complex and variable morphology of breast lesions. To address these challenges, this study proposes a novel multi-scale and region-aware network (MSRA-Net) for breast cancer ultrasound image segmentation. In the encoder stage, the model incorporates a Multi-Scale Feature Extraction Module (MFEM), which leverages wavelet convolution (WTConv) with a large receptive field to efficiently capture morphological features of lesions at multiple scales. In the decoder stage, the model innovatively integrates a Global Region-Aware Block (GRAB) and a Boundary Feature Enhancement Block (BFEB). The GRAB employs Space-Adaptive Channel Reduction Attention (SCRA) to focus on the global features of lesions, while the BFEB enhances boundary depiction accuracy by separating and processing low-frequency and high-frequency features. Extensive experiments on three breast cancer ultrasound datasets, BUSI, BUS-BRA, and BUET_BUSD, demonstrate that the proposed network significantly outperforms state-of-the-art medical image segmentation methods for breast ultrasound lesion segmentation. Furthermore, ablation studies validate the effectiveness of the individual modules and underscore the robustness and clinical utility of the proposed approach.</div></div>","PeriodicalId":51011,"journal":{"name":"Digital Signal Processing","volume":"168 ","pages":"Article 105534"},"PeriodicalIF":3.0,"publicationDate":"2025-08-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144858365","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"工程技术","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"DACDM-CR: Discriminative attention and cloud-aware dynamic mamba for SAR-assisted optical data cloud removal","authors":"Wenkai Xu , Wenxing Bao , Wei Feng , Kewen Qu , Xuan Ma , Xiaowu Zhang , Wenlong Wang","doi":"10.1016/j.dsp.2025.105522","DOIUrl":"10.1016/j.dsp.2025.105522","url":null,"abstract":"<div><div>Cloud contamination significantly diminishes the potential applications of optical remote sensing images in geosciences, whereas Synthetic Aperture Radar (SAR) images remain unaffected by such interference. Numerous approaches have sought to leverage information from SAR images to restore affected areas in optical images. However, these methods still have room for improvement in fully leveraging the synergistic potential of SAR and optical images while preserving the global consistency of the reconstructed images. This paper proposes a novel SAR-assisted cloud removal network for optical remote sensing images, which comprises two key stages: feature extraction and image reconstruction. The feature extraction stage involves extracting deep features from optical and SAR images, which are then integrated into a Discriminative Attention Feature Interaction (DAFI) module. This enables multimodal feature collaboration, effectively recovering missing textural information in cloud-contaminated regions. In the image reconstruction stage, a Dynamic Cloud-Adaptive MAMBA Gated Spatial-Channel Attention (DMA) module is employed, efficiently reconstructing global contextual information with linear computational complexity while restoring spatial and channel details in cloud-affected areas. To further improve visual quality, this study introduces a multi-scale cloud-adaptive perceptual loss function based on VGG19, specifically targeting cloud-contaminated regions across different scales. The proposed method is validated on the SEN12MSCR dataset and M3M-CR dataset, with experimental results demonstrating superior performance over existing algorithms in terms of peak signal-to-noise ratio (PSNR), structural similarity index (SSIM), spectral angle mapper (SAM), and mean absolute error (MAE).</div></div>","PeriodicalId":51011,"journal":{"name":"Digital Signal Processing","volume":"168 ","pages":"Article 105522"},"PeriodicalIF":3.0,"publicationDate":"2025-08-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144826722","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"工程技术","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Enhancing aircraft safety: Automated three-dimensional defect detection, localization and sizing in non-destructive testing","authors":"Ali Mohamed Tahar Gouicem , Abdeldjalil Ouahabi , Mostepha Yahi , Sébastien Jacques","doi":"10.1016/j.dsp.2025.105535","DOIUrl":"10.1016/j.dsp.2025.105535","url":null,"abstract":"<div><div>In most cases, non-destructive testing (NDT) techniques typically rely solely on two-dimensional image data for defect detection, particularly in CT imaging. This limitation hindered the ability to accurately reconstruct the exact three-dimensional form of defects. In this study, we propose solutions for three-dimensional image reconstruction, which is crucial in industrial non-destructive testing applications and in the aircraft industry. We introduce a new, fully automated method for detecting, locating, and sizing defects in the context of non-contact quality control in industry, specifically focusing on aircraft-type equipment. Our method was applied to a confidential database containing over 120,000 images from Tassili Work Airlines Company. This database was curated and labeled by senior experts in the field of diagnostics and non-destructive testing, and we compare our results with theirs. Our combined approach, utilizing expectation maximization and fuzzy inference penalty, proves to be effective in addressing the challenging inverse problem of three-dimensional computed tomography defect detection, localization, and dimensioning. This contributes to enhancing safety in aeronautical transportation by enabling accurate diagnosis of parts.</div></div>","PeriodicalId":51011,"journal":{"name":"Digital Signal Processing","volume":"168 ","pages":"Article 105535"},"PeriodicalIF":3.0,"publicationDate":"2025-08-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144867333","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"工程技术","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"GMANet: Gate mamba attention for fetal head and pubic symphysis segmentation in ultrasound images analysis","authors":"Mengqing Mei , Yibo Li , Xuannan An , Zhiwei Ye , Liye Mei","doi":"10.1016/j.dsp.2025.105533","DOIUrl":"10.1016/j.dsp.2025.105533","url":null,"abstract":"<div><div>Accurate segmentation of fetal heads and the pubic symphysis (PSFH) in ultrasound images during childbirth is crucial for precise angle of progression (AoP) measurements, which enables clinicians to manage dystocia complications effectively. Conventional approaches relying on sonographer-dependent manual selection prove time-consuming and operator-sensitive, while concurrently coping with inherent ultrasound noise, anatomical occlusions, and substantial target shape or location variations. To overcome these challenges in small-target segmentation and boundary delineation, we present GMANet, a novel Mamba-based architecture. Our core design introduces the Gate Mamba Attention (GMA) that synergistically integrates selective state-space modeling with a gating mechanism, where sequence-aware attention of Mamba dynamically focuses on crucial spatial dependencies. At the same time, the fixed-parameter architecture maintains stable local feature extraction. Then we develop an Adaptive Pyramid Pooling Module (APPM) that enhances multiscale discriminability through parallel multi-depth pooling, effectively handling significant size disparities in medical targets. Subsequent feature refinement employs our Efficient Multiscale Attention (EMA) to aggregate multi-receptive-field context through parameter-efficient spatial-channel interactions adaptively. Finally, the proposed GMANet demonstrates statistically significant advantages when benchmarked against contemporary state-of-the-art (SOTA) segmentation methodologies on the PSFH dataset, achieving a composite score of 0.9326, F1-score of 76.04, and ΔAoP of 7.70°. This advancement holds significant promise for automating fetal imaging analysis, potentially improving clinical consistency while reducing operator dependence. Our code is available at <span><span>https://github.com/AgamLi/GMANet-Gate-Mamba-Attention</span><svg><path></path></svg></span></div></div>","PeriodicalId":51011,"journal":{"name":"Digital Signal Processing","volume":"168 ","pages":"Article 105533"},"PeriodicalIF":3.0,"publicationDate":"2025-08-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144852075","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"工程技术","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}