{"title":"Multi-dimensional graph linear canonical transform","authors":"Na Li , Zhichao Zhang","doi":"10.1016/j.dsp.2025.105334","DOIUrl":"10.1016/j.dsp.2025.105334","url":null,"abstract":"<div><div>Multi-dimensional (M-D) graph signals represent fundamental data structures in various practical applications, including digital imaging systems, sensor network telemetry, and meteorological observation records. The development of efficient transformation methodologies for processing such M-D graph signals within the linear canonical transform domain continues to present significant challenges. This paper introduces two novel transforms: the two-dimensional graph linear canonical transform based on central discrete dilated Hermite functions (2-D CDDHFs-GLCT) and the two-dimensional graph linear canonical transform based on chirp multiplication-chirp convolution-chirp multiplication decomposition (2-D CM-CC-CM-GLCT). These foundational constructs are extended to the multi-dimensional domain, resulting in M-D CDDHFs-GLCT and M-D CM-CC-CM-GLCT. In terms of computational complexity, additivity and reversibility, M-D CDDHFs-GLCT and M-D CM-CC-CM-GLCT are compared. Theoretical analyses reveal that the M-D CM-CC-CM-GLCT algorithm achieves substantial computational complexity reduction compared to its CDDHFs-based counterpart. Numerical simulations further establish that while maintaining comparable additivity performance, the M-D CM-CC-CM-GLCT exhibits enhanced reversibility characteristics. To validate practical applicability, the proposed M-D GLCT framework is implemented in data compression scenarios. Experimental results demonstrating superior algorithmic efficiency and implementation effectiveness compared to multi-dimensional graph fractional Fourier transform.</div></div>","PeriodicalId":51011,"journal":{"name":"Digital Signal Processing","volume":"165 ","pages":"Article 105334"},"PeriodicalIF":2.9,"publicationDate":"2025-05-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144099315","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"工程技术","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"DP-YOLO: A lightweight traffic sign detection model for small object detection","authors":"Ji Qiu , Wenbo Zhang , Siyuan Xu , Huiyu Zhou","doi":"10.1016/j.dsp.2025.105311","DOIUrl":"10.1016/j.dsp.2025.105311","url":null,"abstract":"<div><div>Autonomous driving is a critical area in artificial intelligence, with vast potential for development. While current object detection algorithms have shown strong performance in traffic sign detection, they still face difficulties with small object recognition, often resulting in missed or false detections. To address this, we propose DP-YOLO, a traffic sign detection algorithm based on YOLOv8s. To enhance detection accuracy for small objects and reduce the model's parameter count, we first removed the large object detection layer from the baseline model and added a small object detection layer. In the feature extraction stage, we design the DBBNCSPELAN4 module to boost the network's feature extraction capability. Additionally, we propose the PTCSP module, incorporating Transformer technology into the model's feature processing network and reducing both parameters and computational cost. Finally, we introduce the W3F_MPDIoU loss to mitigate the impact of low-quality samples on the model and enhance its robustness. Experiments demonstrate that, compared to YOLOv8s, DP-YOLO reduces the model's parameter count by 77.0%, while achieving improvements in mAP0.5 by 5.8% on the TT100K dataset, 2.7% on the GTSDB dataset, and 1.3% on the CCTSDB dataset. Experimental results demonstrate that the proposed method effectively enhances the detection capability for small-sized traffic signs and exhibits high potential for edge deployment.</div></div>","PeriodicalId":51011,"journal":{"name":"Digital Signal Processing","volume":"165 ","pages":"Article 105311"},"PeriodicalIF":2.9,"publicationDate":"2025-05-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144069925","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"工程技术","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Contour detection network simulating the primary visual pathway","authors":"Ke Chen, Yingle Fan, Tao Fang","doi":"10.1016/j.dsp.2025.105308","DOIUrl":"10.1016/j.dsp.2025.105308","url":null,"abstract":"<div><div>Biological vision exhibits exceptional contour perception capabilities. In view of this, research on contour detection guided by biological vision is gradually gaining attention. Inspired by the transmission and processing of visual signals in the primary visual pathway, this study proposes a lightweight contour detection network called the Primary-Visual-Pathway UNet (PVP-UNet), comprising an encoder and a decoder. Primarily, inspiration from the binocular vision mechanism, we transmit the original image through dual deformable convolution modules that simulate the receptive fields of the left and right retinas in the encoder. Utilizing the characteristic of the optic chiasm, the output features of the retinal layers are split, swapped, and merged for further processing. Subsequently, dilated convolution modules and normal convolution modules are involved to simulate the magnocellular (M) and parvocellular (P) pathways of the lateral geniculate nucleus (LGN), respectively. An inhibition module was designed based on the suppression mechanism of classical/non-classical receptive fields (CRF/NCRF) in the primary visual cortex (V1). Additionally, the interconnection pattern of the inhibition modules is deployed by leveraging the aggregation characteristics of simple cells to complex cells in the V1 layer. Inspired by the feedback mechanism of visual information, the feature fusion module is introduced in the decoder to integrate features from different encoding layers in the reverse direction of signal transmission along the primary visual pathway. Our experiments show that the respective Optimal Dataset Scale (ODS) scores achieve 0.811, 0.756, and 0.896 on BSDS500, NYUD, and BIPED datasets. The experimental results demonstrate that the proposed network effectively suppresses background noise and highlights primary contours, exhibiting excellent detection performance on the tested datasets. The code is available at <span><span>https://github.com/k3chencoco/PVP-UNet</span><svg><path></path></svg></span>.</div></div>","PeriodicalId":51011,"journal":{"name":"Digital Signal Processing","volume":"165 ","pages":"Article 105308"},"PeriodicalIF":2.9,"publicationDate":"2025-05-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144090317","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"工程技术","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Enhancing medical image fusion through advanced decomposition and optimization methods","authors":"Phu-Hung Dinh","doi":"10.1016/j.dsp.2025.105315","DOIUrl":"10.1016/j.dsp.2025.105315","url":null,"abstract":"<div><div>The fusion of medical images plays a pivotal role in advancing clinical diagnostics by integrating complementary information from diverse imaging modalities. However, existing image fusion algorithms encounter several limitations that reduce their effectiveness. Key challenges include low-quality input images, inefficient decomposition techniques, and fusion rules that lack optimal design. In this study, we propose a novel adaptive image fusion framework that introduces three key advancements. First, we introduce an image enhancement method and utilize its optimized parameters to develop the Adaptive Three-Component Image Decomposition (ATCID) approach. Next, this decomposition method is applied to decompose the input into three components: a brightness-enhanced component, a high-contrast & low-noise component, and an edge-detail component. Second, we introduce an adaptive fusion method, AFM_EOA, based on the Equilibrium Optimization Algorithm (EOA) to effectively integrate base components, resulting in enhanced image quality. Third, a fusion method utilizing Multi-Feature Local Energy (MFLE) is developed for detail components to enhance fine details while minimizing information loss. Comprehensive experimental evaluations on multiple datasets demonstrate that the proposed method surpasses state-of-the-art fusion techniques across various objective metrics, such as contrast index (CI), mutual information (MI), and edge preservation (<span><math><msup><mrow><mi>Q</mi></mrow><mrow><mi>A</mi><mi>B</mi><mo>/</mo><mi>F</mi></mrow></msup></math></span>). The proposed method provides a scalable and effective solution for medical imaging, ensuring superior quality and structural consistency in fused images.</div></div>","PeriodicalId":51011,"journal":{"name":"Digital Signal Processing","volume":"165 ","pages":"Article 105315"},"PeriodicalIF":2.9,"publicationDate":"2025-05-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144069923","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"工程技术","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Symbol constellation predistortion for DCO-OFDM visible light communications system linearization","authors":"Ana Cinta Oria Oria , Juan A. Becerra , María J. Madero-Ayora , Vicente Baena Lecuyer , Carlos Crespo-Cadenas","doi":"10.1016/j.dsp.2025.105310","DOIUrl":"10.1016/j.dsp.2025.105310","url":null,"abstract":"<div><div>Visible Light Communication (VLC) systems face significant challenges due to the inherent nonlinear characteristics of light emitting diodes (LEDs), which cause distortion in the transmitted DCO-OFDM signals. This signal quality degradation can be lower forcing a large input power back-off (IBO), but leading to an inefficient use of the LEDs. In this work, we propose a predistortion technique, referred to as constellation predistorter (CPD), based on a novel frequency-domain algorithm for the linearization of OFDM signals in VLC systems. The CPD operates on the signal constellation and is based on applying a Bayesian pursuit to obtain a sparse memory polynomial (MP) model matrix. For comparison purposes, this method has been compared to the MP-based time-domain digital predistorter (DPD). The linearization performance of the CPD is measured in terms of the error vector magnitude (EVM) and illumination-to-communication conversion efficiency (ICE) parameters. With the proposed predistorter, we achieve a significant IBO reduction as large as 7.6 dB, enhancing the efficiency of VLC systems, or a nearly 62% decrease in the EVM for a fixed IBO, which represents a substantial reduction in signal distortion and an improvement in ICE.</div></div>","PeriodicalId":51011,"journal":{"name":"Digital Signal Processing","volume":"165 ","pages":"Article 105310"},"PeriodicalIF":2.9,"publicationDate":"2025-05-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143948874","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"工程技术","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"An accelerated unsupervised denoising model based on risk estimation and sparsity variational techniques","authors":"Cheng Zhang, Kin Sam Yen","doi":"10.1016/j.dsp.2025.105317","DOIUrl":"10.1016/j.dsp.2025.105317","url":null,"abstract":"<div><div>The Deep Image Prior (DIP) method has limitations in denoising performance, slow convergence, and susceptibility to overfitting. We propose an enhanced model to address these issues by utilizing Stein's Unbiased Risk Estimate (SURE) loss for improved denoising capabilities. Additionally, we incorporate the Total Generalized Variation (TGV) regularization to mitigate overfitting. To overcome the problem of blurred edges caused by TGV, we introduce the Overlapping Group Sparsity (OGS) method into the first-order gradient constraint of the explicit TGV regularization term. Our proposed model employs a fast Split Bregman method to decompose the optimization problem, solving subproblems alternately using the Fast Fourier Transform (FFT) and Majorization-Minimization (MM) techniques. While most DIP variants use random noise images as network input, our model utilizes observed noisy images instead. By integrating SURE and explicit regularization terms, we effectively mitigate performance degradation and severe overfitting associated with the input. Moreover, our approach retains the significant advantage of achieving breakneck convergence speed. In contrast to DIP, our method demonstrates substantial improvements when dealing with Gaussian noise. Specifically, the peak signal-to-noise ratio (PSNR) shows an average increase of 7.0 %, and the structural similarity index measure (SSIM) exhibits an average improvement of 5.5 %. Our proposed model also achieves a peak iteration number that is one-tenth of that in DIP and its variants. Furthermore, our proposed model demonstrates superior performance compared to state of the art supervised learning based convolutional neural networks such as Denoising Convolutional Neural Network (DnCNN).</div></div>","PeriodicalId":51011,"journal":{"name":"Digital Signal Processing","volume":"165 ","pages":"Article 105317"},"PeriodicalIF":2.9,"publicationDate":"2025-05-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144090315","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"工程技术","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"DFLNet: Disentangled feature learning network for breast cancer ultrasound image segmentation","authors":"Zhaoyi Ye , Jin Huang , Yimin Zhang , Jingwen Deng , Jingwen Zhang , Sheng Liu , Du Wang , Liye Mei , Cheng Lei","doi":"10.1016/j.dsp.2025.105331","DOIUrl":"10.1016/j.dsp.2025.105331","url":null,"abstract":"<div><div>Breast cancer stands as one of the most prevalent malignancies among women, underscoring the critical importance of accurate detection for effective treatment. Ultrasound imaging is commonly employed for screening and measurement due to its accessibility and broad applicability. However, the diverse morphology of lesions and the severe interference from pseudo-lesions in ultrasound images make it challenging to accurately segment lesion regions. Therefore, we propose a disentangled feature learning network (DFLNet) to better separate and capture the multi-level features of breast cancer. Specifically, the disentangled learning encoder utilizes multiple parallel ConvNext blocks to extract both low-level texture and deep semantic features. Through reversible connections, it facilitates seamless transmission of multi-level features, thereby achieving effective feature disentanglement and preservation. Then, we design a spatial cross-attention module that stacks spatial attention and cross-attention residually to enhance multi-scale feature representations, aiming to mitigate pseudo-lesion interference and improve lesion feature discrimination. Additionally, we collect clinical ultrasound image data from Renmin Hospital of Wuhan University to construct the BUSI-WHU dataset, providing a valuable benchmark for ultrasound image segmentation. Extensive experimental results on the Dataset-B, BUSI, BUSI-WHU, and TN3k datasets demonstrate that DFLNet outperforms twelve methods across multiple evaluation metrics, highlighting its effectiveness in assisting clinical diagnosis.</div></div>","PeriodicalId":51011,"journal":{"name":"Digital Signal Processing","volume":"165 ","pages":"Article 105331"},"PeriodicalIF":2.9,"publicationDate":"2025-05-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144084010","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"工程技术","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"GLFAFormer: DeepFake forgery detection with adaptive feature extract and align","authors":"Long Chen , Huihuang Zhao , Jiaxin He , Weiliang Meng","doi":"10.1016/j.dsp.2025.105321","DOIUrl":"10.1016/j.dsp.2025.105321","url":null,"abstract":"<div><div>With the rapid development of generative models, there is an increasing demand for universal fake image detectors. In this paper, we investigate the problem of fake image detection for the synthesis of generative models to detect fake images from multiple generative methods. Recent research methods explore the benefits of pre-trained models and mainly adopt a fixed paradigm of training additional classifiers separately, but we find that the fixed paradigm hinders the full learning of forgery features, leading to insufficient representation learning in the detector. The main reason is that the fixed paradigm pays too much attention to global features and neglects local features, which limits the ability of the model to perceive image details and leads to some information loss or confusion. In this regard, based on the pre-trained visual language space, our method introduces two core designs. First, we designed a Deep Window Triple Attention (DWTA) module. A similar dense sliding window strategy is adopted to capture multi-scale local abnormal patterns, and the sensitivity to generated artifacts is enhanced through the triple attention mechanism. Secondly, we proposed a Cross-Space Feature Alignment (CSFA) module. A two-way interactive channel between global features and local features is established, and the alignment loss function is used to achieve semantic alignment of cross-modal feature spaces. The aligned features are adaptively fused through a gating mechanism to obtain the final adaptive forged features. Experiments demonstrate that our method, when trained solely on ProGAN data, achieves superior cross-generator generalization: it attains an average accuracy of 94.7% on unseen GANs and generalizes to unseen diffusion models with 94% accuracy, surpassing existing methods by 2.1%. The source code is available at <span><span>https://github.com/long2580h/GLFAFormer</span><svg><path></path></svg></span>.</div></div>","PeriodicalId":51011,"journal":{"name":"Digital Signal Processing","volume":"165 ","pages":"Article 105321"},"PeriodicalIF":2.9,"publicationDate":"2025-05-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144069926","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"工程技术","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Salient object detection in optical remote sensing images based on hybrid edge fusion perception","authors":"Xingyu Dong , Jiangtao Wang , Baoxin Dong","doi":"10.1016/j.dsp.2025.105332","DOIUrl":"10.1016/j.dsp.2025.105332","url":null,"abstract":"<div><div>Fully extracting and leveraging edge features to refine the structural integrity of salient regions remains a significant challenge in optical remote sensing images (ORSIs) analysis. In this study, we present a novel and effective Hybrid Edge Fusion Perception Network (HEFPNet), aimed at improving the model's ability to detect salient objects. Our network utilizes a Transformer-based backbone architecture to generate four levels of feature representations that capture global long-range dependencies. Subsequently, we introduce an Edge Feature Enhancement Fusion Module (EFEFM), which enhances the structure and critical regions of the target by extracting and fusing edge information from both feature maps and attention maps. Furthermore, to accurately localize objects, we propose a Multi-Scale Directional Perception Module (MSDPM). Finally, a Deformable Dilated Convolution Decoder (DCD) is designed, which combines the advantages of deformable and dilated convolutions, adaptively capturing key features. Comprehensive experiments performed on three publicly accessible datasets—ORSSD, EORSSD, and ORSI-4199—demonstrate that the proposed network achieves better performance compared to existing state-of-the-art approaches.</div></div>","PeriodicalId":51011,"journal":{"name":"Digital Signal Processing","volume":"165 ","pages":"Article 105332"},"PeriodicalIF":2.9,"publicationDate":"2025-05-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144123807","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"工程技术","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Decoding audio related region with temporal and structural enhancement for audio visual segmentation","authors":"Qingwei Geng, Xiaodong Gu","doi":"10.1016/j.dsp.2025.105302","DOIUrl":"10.1016/j.dsp.2025.105302","url":null,"abstract":"<div><div>Audio-Visual Segmentation (AVS) is a task that aims to predict pixel-level masks for sound-producing objects in videos. Recent advanced AVS methods primarily focus on cross-modal interaction while often neglecting the significance of temporal modeling and precise structural prediction. To address these challenges, we propose a novel AVS framework incorporating several innovations. Firstly, we propose a Temporal Enhancement Module (TEM) that effectively captures temporal relationships across frames. Secondly, we devise an Audio-Visual Decoder that utilizes audio information to selectively emphasize relevant visual regions during decoding. Besides, Structural Similarity (SSIM) is introduced into the loss function to preserve the structural integrity of predicted masks, thereby enhancing the coherence and precision of object boundaries. The extensive experimental results on multiple AVS datasets show that our proposed method outperforms current advanced AVS models and approaches from other tasks in terms of the mean Intersection over Union (mIoU) and F-score metrics.</div></div>","PeriodicalId":51011,"journal":{"name":"Digital Signal Processing","volume":"165 ","pages":"Article 105302"},"PeriodicalIF":2.9,"publicationDate":"2025-05-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144084011","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"工程技术","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}