Digital Signal Processing: Latest Articles

An algorithm for multi-directional text detection in natural scenes
IF 2.9, Zone 3, Engineering and Technology
Digital Signal Processing Pub Date: 2025-07-17 DOI: 10.1016/j.dsp.2025.105482
Dapeng Wan, Lixia Deng, Jinshun Dong, Meiqi Guo, Jianqin Yin, Chenxu Liu, Haiying Liu
Abstract: Background interference and scale variations make text detection in natural scenes challenging, especially in applications such as autonomous driving and image understanding, which place higher demands on detection accuracy and efficiency. Against this background, efficient detection algorithms designed specifically for natural-scene text are particularly important. To this end, this paper proposes Multi-Directional Text You Only Look Once (MDT-YOLO). First, a Dual Path Residual Connection (DPRC) block is designed, which enhances the model's multi-scale feature perception and alleviates missed detections caused by scale variations. Second, to reduce text information loss during downsampling, the Depthwise Separable Strided Downsampling (DSSDown) module is proposed, improving the model's ability to recognize fine-grained text regions. Additionally, an Efficient Down-Transition (EDT) module is constructed to rebuild the backbone network, jointly improving semantic modeling and computational efficiency. Experimental results show that, compared with the baseline model, MDT-YOLO reduces the parameter count by 29.4% while processing speed remains essentially unchanged. On the MSRA-TD500 dataset, Precision and mAP@0.5 improve by 3.2% and 2.8%, respectively, and on the HUST-TR400 dataset they improve by 1.7% and 1.3%, respectively. The code will be available at https://github.com/WDP-0806/MDT-YOLO.
Digital Signal Processing, Volume 168, Article 105482.
Citations: 0
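The abstract above names a depthwise separable, strided downsampling module but gives no layer details. As a rough illustration of why depthwise separable convolutions save parameters, the sketch below compares weight counts for a generic standard convolution and a generic depthwise separable one, and computes the output size of a strided convolution. The channel counts, kernel size, and function names are hypothetical and are not taken from the paper.

```python
def conv_params(c_in, c_out, k):
    """Weights in a standard k x k convolution (bias ignored)."""
    return k * k * c_in * c_out


def dw_separable_params(c_in, c_out, k):
    """Weights in a depthwise k x k convolution (one filter per input
    channel) followed by a pointwise 1 x 1 convolution (bias ignored)."""
    return k * k * c_in + c_in * c_out


def out_size(size, k, stride, padding):
    """Output length of a convolution along one spatial dimension."""
    return (size + 2 * padding - k) // stride + 1


# Hypothetical layer: 64 -> 128 channels, 3 x 3 kernel, stride 2, padding 1.
standard = conv_params(64, 128, 3)
separable = dw_separable_params(64, 128, 3)
halved = out_size(640, 3, 2, 1)
```

With these assumed sizes the separable layer needs roughly one eighth of the weights, and a stride of 2 halves each spatial dimension.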
Deep tensor completion graph convolutional subspace clustering
IF 2.9, Zone 3, Engineering and Technology
Digital Signal Processing Pub Date: 2025-07-15 DOI: 10.1016/j.dsp.2025.105478
Chunzhu Xie, Jun Kong, Min Jiang, Xuefeng Tao
Abstract: Graph Convolutional Subspace Clustering (GCSC) aims to integrate the topological information of data with subspace representations by means of Graph Convolutional Networks (GCNs). However, existing methods are limited by their emphasis on local topological information, which neglects global relationships in the data. In addition, their adjacency matrices are fixed and predefined, so they cannot adapt to changing features during training and are easily affected by noise. To address these issues, we propose Deep Tensor Completion Graph Convolutional Subspace Clustering (DTC-GCSC). First, we treat the initialized adjacency matrix as a trainable parameter, enabling its joint optimization with the model through a deep architecture. Based on this framework, we further incorporate global topological information by integrating conventional subspace clustering (CSC) into GCSC, extending local relationships to a global structure. Finally, to enhance the consistency between local and global information, we introduce a Tensor Nuclear Norm (TNN) constraint to enforce high-order correlations across them. Extensive experiments on multiple datasets demonstrate the superiority of our method over state-of-the-art approaches.
Digital Signal Processing, Volume 168, Article 105478.
Citations: 0
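For readers unfamiliar with how a GCN uses an adjacency matrix, the following is a minimal sketch of the symmetric normalization commonly used in standard GCNs, followed by one propagation step. It illustrates the fixed, predefined adjacency that the abstract above contrasts with its trainable one; it does not reproduce the DTC-GCSC method, and the function names are hypothetical.

```python
import math


def normalized_adjacency(adj):
    """Return D^{-1/2} (A + I) D^{-1/2} for a square, non-negative
    adjacency matrix given as a list of rows."""
    n = len(adj)
    # Add self-loops so every node keeps part of its own feature.
    a_hat = [[adj[i][j] + (1.0 if i == j else 0.0) for j in range(n)]
             for i in range(n)]
    deg = [sum(row) for row in a_hat]  # each degree is >= 1 because of the self-loop
    return [[a_hat[i][j] / math.sqrt(deg[i] * deg[j]) for j in range(n)]
            for i in range(n)]


def propagate(norm_adj, features):
    """One propagation step: each node's new feature vector is a weighted
    sum of its neighbours' feature vectors (no learned weights here)."""
    n, d = len(features), len(features[0])
    return [[sum(norm_adj[i][k] * features[k][c] for k in range(n))
             for c in range(d)]
            for i in range(n)]


# Two connected nodes with one scalar feature each.
norm = normalized_adjacency([[0, 1], [1, 0]])
smoothed = propagate(norm, [[2.0], [4.0]])
```

In a trainable-adjacency setting, the entries of `adj` would be parameters updated during training rather than fixed inputs.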
Multi-information-aware speech enhancement through self-supervised learning
IF 2.9, Zone 3, Engineering and Technology
Digital Signal Processing Pub Date: 2025-07-15 DOI: 10.1016/j.dsp.2025.105464
Xiaotong Tu, Jiaxin Xie, Yijin Mao, Yue Huang, Xinghao Ding, Shaogan Ye
Abstract: Speech enhancement is a crucial technology aimed at improving the quality and intelligibility of speech signals in noisy environments. Recent advancements in deep neural networks have leveraged abundant clean speech datasets for supervised learning with remarkable results. However, supervised models suffer from poor robustness and generalization due to the scarcity of clean speech data and the complexity of the noise distribution in the real world. In this paper, a self-supervised speech enhancement model, called Multi-Information-Aware Speech Enhancement (MIA-SE), is proposed to address these challenges. A novel self-supervised training strategy is introduced in which denoising is performed on a single input twice, with the first denoiser output being employed as an Implicit Deep Denoiser Prior (IDDP) to supervise the subsequent denoising process. Furthermore, an encoder–decoder denoiser architecture based on a complex ratio masking strategy is incorporated to extract phase and magnitude features simultaneously. To capture sequence context information for improved embedding, transformer modules with multi-head attention mechanisms are integrated within the denoiser. The training process is guided by a newly formulated loss function to ensure successful and effective learning. Experimental results on synthetic and real-world noise databases demonstrate the effectiveness of MIA-SE, particularly in scenarios where paired training data is unavailable.
Digital Signal Processing, Volume 168, Article 105464.
Citations: 0
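The abstract above relies on complex ratio masking, in which a complex-valued mask multiplies each time-frequency bin of the noisy spectrum so that magnitude and phase are corrected together. The sketch below shows only that masking arithmetic on a handful of made-up spectral bins. In MIA-SE the mask would be predicted by the network; here, purely for illustration, it is computed from a known clean signal, and the function names are hypothetical.

```python
def ideal_complex_ratio_mask(clean, noisy, eps=1e-8):
    """Per-bin mask M such that M * Y is approximately S, computed as
    S * conj(Y) / (|Y|^2 + eps) to avoid division by zero."""
    return [s * y.conjugate() / (abs(y) ** 2 + eps)
            for s, y in zip(clean, noisy)]


def apply_mask(mask, noisy):
    """Complex multiplication scales the magnitude and rotates the phase."""
    return [m * y for m, y in zip(mask, noisy)]


# Made-up spectral bins (complex STFT coefficients).
clean_bins = [1 + 1j, 0.5 - 0.2j, -0.3 + 0.7j]
noisy_bins = [2 + 0j, 1 + 1j, -1 + 0.5j]

mask = ideal_complex_ratio_mask(clean_bins, noisy_bins)
estimate = apply_mask(mask, noisy_bins)
```

A real-valued magnitude mask would leave the noisy phase untouched, which is the limitation complex masking is meant to avoid.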
Multi-histogram equalization for image enhancement using adaptive fuzzy clustering and optimized clipping
IF 2.9, Zone 3, Engineering and Technology
Digital Signal Processing Pub Date: 2025-07-15 DOI: 10.1016/j.dsp.2025.105466
Chunmeng Li, Chenyang Zhang, Ziyun Liu, Xiaozhong Yang
Abstract: Image enhancement plays a crucial role in medical imaging and engineering by highlighting details and key regions, thereby improving analytical and diagnostic accuracy. Histogram equalization (HE) is one of the most widely used techniques for image enhancement. However, traditional HE methods lack adaptability to varying brightness regions and often introduce local distortions and artifacts. To address these issues, this paper proposes a multi-histogram equalization algorithm based on adaptive fuzzy clustering and optimized clipping. First, a histogram density analysis method is employed to automatically detect peaks, and the fuzzy C-means (FCM) clustering algorithm is used to adaptively segment image brightness, achieving intelligent histogram partitioning. Then, an optimized clipping and redistribution strategy is designed for each sub-histogram, where a redistribution parameter is introduced to balance enhancement and detail preservation, effectively suppressing over-enhancement. Finally, the dynamic range of each sub-image is adjusted based on the original grayscale distribution and pixel proportion, followed by independent equalization. Experimental results demonstrate that the proposed method achieves superior enhancement across diverse brightness conditions and scenes, outperforming ten state-of-the-art HE algorithms in both visual quality and quantitative metrics.
Digital Signal Processing, Volume 168, Article 105466.
Citations: 0
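The clipping-and-redistribution idea in the abstract above can be illustrated on a single histogram. The sketch below clips each bin at a limit, spreads the clipped excess uniformly over all bins, and builds the equalization mapping from the cumulative sum. It is a generic clipped histogram equalization, not the paper's method: the FCM-based partitioning into sub-histograms and the paper's redistribution parameter are not reproduced, and `clip_fraction` is a hypothetical parameter.

```python
def clipped_equalization_map(pixels, levels=256, clip_fraction=0.02):
    """Return a list mapping each input grey level to an output level.

    pixels: iterable of integer grey levels in [0, levels).
    clip_fraction: maximum share of all pixels that one bin may hold.
    """
    pixels = list(pixels)
    hist = [0] * levels
    for p in pixels:
        hist[p] += 1

    # Clip tall bins and spread the removed counts uniformly over all bins.
    clip = max(1, int(clip_fraction * len(pixels)))
    excess = sum(max(0, h - clip) for h in hist)
    clipped = [min(h, clip) + excess / levels for h in hist]

    # The cumulative distribution of the clipped histogram gives the mapping.
    total = sum(clipped)
    mapping, cumulative = [], 0.0
    for h in clipped:
        cumulative += h
        mapping.append(round((levels - 1) * cumulative / total))
    return mapping


# A made-up, strongly bimodal image: 90 dark pixels and 10 bright ones.
image = [10] * 90 + [200] * 10
mapping = clipped_equalization_map(image)
enhanced = [mapping[p] for p in image]
```

Clipping limits how steep the mapping can become around a dominant grey level, which is what suppresses over-enhancement.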
A small target detection algorithm for unmanned aerial vehicles incorporating global information modeling and multi-scale feature interaction
IF 2.9, Zone 3, Engineering and Technology
Digital Signal Processing Pub Date: 2025-07-15 DOI: 10.1016/j.dsp.2025.105465
Wencong Liu, Yehang Li, Shunsong Huang, Qing Yu
Abstract: Aerial image object detection faces critical challenges due to target scale variation, high background complexity, and the vulnerability of small objects to noise. Existing methods remain limited in global context modeling and cross-scale feature interactions. To address these issues, we propose GM-YOLO, a novel small object detection framework that integrates global semantic modeling with dynamic multi-scale feature fusion. First, the CFC3K2 module synergizes convolutional neural networks (CNNs) and Transformers, leveraging depthwise separable convolutions and multilayer perceptrons (MLPs) to enhance local detail retention and mitigate feature dilution in small objects. Second, the SPPF-LSKA module employs large-kernel separable convolutions and dilated convolutions to optimize multi-scale feature fusion and global response capability. Third, the BiFPN-SDI architecture improves cross-level feature interaction efficiency through nonlinear multiplicative fusion and dynamic scale alignment. Additionally, the Shared Detail Enhancement Detection Head (SDEDH) reduces parameter redundancy via group normalization and parameter sharing while strengthening edge feature extraction. Finally, the SlideLoss function dynamically modulates gradients to alleviate sample imbalance. Experiments demonstrate that GM-YOLO achieves mAP@50 and mAP@50-95 scores of 43.8% and 27.0% on VisDrone2019, outperforming YOLOv11s by 4.7% and 3.6% respectively, with a 16% parameter reduction (7.9 million). Generalization tests on DOTAv2 further validate its robustness, achieving a 5.3% improvement in mAP@50. GM-YOLO surpasses mainstream detectors in both accuracy and efficiency for complex aerial scenarios.
Digital Signal Processing, Volume 168, Article 105465.
Citations: 0
An evaluation and study of detail contrast preservation and color consistency in decolorization
IF 2.9, Zone 3, Engineering and Technology
Digital Signal Processing Pub Date: 2025-07-15 DOI: 10.1016/j.dsp.2025.105468
Shenglin Peng, Tong Gao, Shuyi Qu, Zhe Yu, Jun Wang, Jinye Peng
Abstract: Grayscale conversion plays a crucial role in image processing, particularly for edge detection and segmentation tasks, where decolorization quality directly impacts subsequent analysis. An ideal decolorization algorithm should be both efficient and robust while preserving color consistency and detail contrast. In this study, we revisit the RTCP (Real-time Contrast-Preserving Decolorization) algorithm and propose three key optimizations: a clustering-guided decolorization approach, a locally adaptive decolorization strategy, and a weight-optimized decolorization method. To enhance solution quality, we implement a constrained particle swarm optimization framework to systematically explore the parameter space. Experimental validation on two standard datasets (Čadík and CSDD) demonstrates that our optimized methods handle diverse decolorization scenarios more effectively while maintaining competitive performance against existing approaches. Recognizing the limitations of current evaluation metrics in assessing detail contrast preservation, we introduce the D-C2G-SSIM metric for more accurate quantitative assessment. Comparative results show consistent improvements over the original RTCP algorithm, with the average D-C2G-SSIM score increasing from 0.8331 to 0.8442 on the Čadík dataset and from 0.8696 to 0.8847 on the CSDD dataset, confirming the effectiveness of our approach.
Digital Signal Processing, Volume 168, Article 105468.
Citations: 0
Cross-modal information interaction of binocular predictive networks for RGBT tracking
IF 2.9, Zone 3, Engineering and Technology
Digital Signal Processing Pub Date: 2025-07-14 DOI: 10.1016/j.dsp.2025.105473
Jianming Chen, Dingjian Li, Xiangjin Zeng, Yaman Jing, Zhenbo Ren, Jianglei Di, Yuwen Qin
Abstract: RGBT tracking aims to aggregate information from both visible and thermal infrared modalities to achieve visual object tracking. Although many RGBT tracking methods have been proposed, they often suffer from target loss or tracking drift because they cannot effectively extract the useful features contained in the multimodal information. To handle this problem, we propose a cross-modal information interaction binocular prediction network. First, a deep, multi-branch feature extraction network is constructed based on Siamese networks to fully exploit the semantic features of images from different optical modalities. The designed image feature enhancement modules are utilized to effectively capture and enhance object features, thereby improving tracking performance. Second, a fusion scheme is developed to achieve bidirectional fusion of multimodal features, leveraging complementary cross-modal information to retain distinguishable object characteristics across different modalities. Finally, the anchor-free concept is introduced into the RGBT object tracking domain and combined with a Peak Adaptive Selection (PAS) module to design a binocular prediction network, making the tracker more flexible and versatile. Evaluation experiments conducted on three standard RGBT tracking datasets, namely GTOT, RGBT234, and LasHeR, demonstrate that the modifications made to the baseline Siamese network architecture are effective. The proposed tracker is competitive with existing state-of-the-art (SOTA) methods, achieving comparable results in terms of precision and success rate. The key advantage of the proposed method lies in the robust fusion of multimodal features and the flexibility introduced by the anchor-free prediction design, which contribute to the stability of the proposed tracker across various scenarios. Code is released at https://github.com/JMChenl/RGBT-tracking.git.
Digital Signal Processing, Volume 168, Article 105473.
Citations: 0
Wavelet-based multi-level information compensation learning for visible-infrared person re-identification
IF 2.9, Zone 3, Engineering and Technology
Digital Signal Processing Pub Date: 2025-07-14 DOI: 10.1016/j.dsp.2025.105471
Haobiao Fan, Yanbing Chen, Yibo Chen, Zhixin Tie, Hao Sheng, Wei Ke
Abstract: The main challenge in cross-modal person re-identification (VI-ReID) is extracting discriminative features from different modalities. Most existing methods focus on minimizing modal differences but overlook the shallow modality-invariant information lost as network depth increases. To address this, we propose the Wavelet-based Multi-level Information Compensation (WMIC) learning method. At multiple network stages, we design an Information Compensation Block (ICB) that applies wavelet decomposition to deep features, producing four wavelet subbands to preserve modality-invariant details and enlarge the receptive field. These subbands are used to compute an attention matrix with shallow features, which is then applied to enhance shallow features' local information. Additionally, we represent each person image with two sets of embeddings by introducing a Wavelet Enhancement Block (WEB) to generate an additional embedding. Finally, we use a dual-branch center-guided loss to make the two embeddings complementary, thereby reducing the disparity between infrared and visible images. Extensive experiments on the SYSU-MM01, RegDB, and LLCM datasets demonstrate that WMIC outperforms existing methods.
Digital Signal Processing, Volume 168, Article 105471.
Citations: 0
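The abstract above describes decomposing a feature map into four wavelet subbands. As a small illustration of what such a decomposition produces, the sketch below performs one level of a 2D transform on a tiny matrix, assuming an orthonormal Haar wavelet. The abstract does not say which wavelet WMIC uses, so the Haar choice and the function name are assumptions.

```python
def haar_dwt2(x):
    """One-level 2D Haar transform of a matrix (list of rows) whose height
    and width are both even. Returns four subbands, each half the size:
      approx : local average (low-pass in both directions)
      row_d  : difference between the upper and lower row of each 2 x 2 block
      col_d  : difference between the left and right column of each block
      diag_d : diagonal difference
    """
    approx, row_d, col_d, diag_d = [], [], [], []
    for i in range(0, len(x), 2):
        r_a, r_r, r_c, r_d = [], [], [], []
        for j in range(0, len(x[0]), 2):
            a, b = x[i][j], x[i][j + 1]
            c, d = x[i + 1][j], x[i + 1][j + 1]
            r_a.append((a + b + c + d) / 2)
            r_r.append((a + b - c - d) / 2)
            r_c.append((a - b + c - d) / 2)
            r_d.append((a - b - c + d) / 2)
        approx.append(r_a)
        row_d.append(r_r)
        col_d.append(r_c)
        diag_d.append(r_d)
    return approx, row_d, col_d, diag_d


# A single 2 x 2 block of made-up feature values.
approx, row_d, col_d, diag_d = haar_dwt2([[1, 2], [3, 4]])
```

Because the transform is orthonormal, the sum of squares of the four coefficients equals the sum of squares of the input block, so no information is discarded when the resolution is halved.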
A unified estimation method for target number and angle-range parameters in FDA-MIMO radar under impulse noise environments
IF 2.9, Zone 3, Engineering and Technology
Digital Signal Processing Pub Date: 2025-07-14 DOI: 10.1016/j.dsp.2025.105463
Menghan Chen, Hongyuan Gao, Yapeng Liu, Helin Sun
Abstract: In the presence of impulse noise, estimating both the target number and the angle-range parameters of frequency diverse array multiple-input multiple-output (FDA-MIMO) radars is a recognized challenge. In this paper, a novel unified method is introduced for simultaneously estimating both the angle-range parameters and the target number of FDA-MIMO radars under impulse noise. To mitigate the effects of impulse noise, a novel adaptive low-order covariance (ALC) method that requires no prior information is proposed. Additionally, a two-dimensional spatial spectrum function is derived using the ALC-MVDR method, leveraging the ability of the minimum-variance distortionless response (MVDR) beamformer to provide a two-dimensional spatial spectrum when the target number is unknown. To resolve the two-dimensional spatial spectrum function, the multimodal quantum sunflower optimization algorithm (MQSFOA) is introduced, which can effectively identify the number of spectral peaks and estimate the peak locations without quantization errors. The corresponding Cramér–Rao bound (CRB) is then derived for evaluating the developed method under impulse noise. Comparative simulation studies demonstrate the significant performance improvements of the presented method.
Digital Signal Processing, Volume 168, Article 105463.
Citations: 0
Kernel Gaussian processes based extended target tracking in polar coordinate
IF 2.9, Zone 3, Engineering and Technology
Digital Signal Processing Pub Date: 2025-07-14 DOI: 10.1016/j.dsp.2025.105462
Dongsheng Yang, Yunfei Guo, Hoseok Sul, Jee Woong Choi, Taek Lyul Song
Abstract: Most traditional extended target tracking (ETT) methods struggle with the strong nonlinearity introduced by polar coordinate measurements and the unknown maneuvering motion model. These factors either lead to high approximation errors or impose a high computational cost, making accurate and efficient tracking challenging. To address these problems, a kernel Gaussian process-based extended target tracking (KGP-ETT) algorithm is proposed. First, the kernel mean embedding (KME) algorithm embeds the posterior distribution into a high-dimensional reproducing kernel Hilbert space (RKHS) and propagates the state particles through the nonlinear motion model, thereby effectively capturing the inherent nonlinearity. Second, based on the KME method, a kernel-based measurement update is proposed to estimate the target state in a linearized manner by integrating kernel techniques into the Gaussian process (GP) framework. Finally, the computational complexity and the theoretical posterior Cramér-Rao lower bound (PCRLB) of the proposed algorithm are analyzed. Simulation and real-world experiments demonstrate that, during target maneuvering, KGP-ETT achieves up to a 77% reduction in centroid root mean square error (RMSE), a 64% reduction in extent RMSE, and a 148% improvement in intersection over union (IoU) compared to state-of-the-art GP and Variational Bayesian (VB) methods. These results highlight the robustness and accuracy of KGP-ETT in handling complex nonlinear ETT problems in polar coordinates.
Digital Signal Processing, Volume 168, Article 105462.
Citations: 0
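The abstract above reports accuracy as intersection over union (IoU) between the estimated and true target extent. The sketch below computes IoU for the simplest case of two axis-aligned rectangles. The paper's extents are unlikely to be plain rectangles, so this is only an illustration of the metric, and the function names and coordinates are hypothetical.

```python
def box_area(box):
    """Area of a box given as (x_min, y_min, x_max, y_max)."""
    return max(0.0, box[2] - box[0]) * max(0.0, box[3] - box[1])


def box_iou(a, b):
    """Intersection over union of two axis-aligned boxes."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    union = box_area(a) + box_area(b) - inter
    return inter / union if union > 0 else 0.0


# Made-up true and estimated extents that overlap in a 1 x 1 square.
truth = (0.0, 0.0, 2.0, 2.0)
estimate = (1.0, 1.0, 3.0, 3.0)
overlap = box_iou(truth, estimate)
```

IoU ranges from 0 (no overlap) to 1 (identical shapes), so a relative improvement above 100% is possible when the baseline IoU is low.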