Signal Processing | Pub Date: 2025-05-06 | DOI: 10.1016/j.sigpro.2025.110075
Vrushali Pagire, Murthy Chavali, Ashish Kale
{"title":"A comprehensive review of object detection with traditional and deep learning methods","authors":"Vrushali Pagire, Murthy Chavali , Ashish Kale","doi":"10.1016/j.sigpro.2025.110075","DOIUrl":"10.1016/j.sigpro.2025.110075","url":null,"abstract":"<div><div>Object detection is one of the most important and challenging tasks of computer vision. It has numerous applications in the fields of agriculture, defence, retail markets and manufacturing units, transportation, social media platforms, medical, wildlife monitoring and conservation. This survey aims to give researchers a comprehensive understanding of the current state of object detection algorithms. In this review, object detection and its different aspects have been covered in detail. This review paper starts with a quick overview of object detection followed by traditional and deep learning models for object detection. The section on deep learning models provides a comprehensive overview of one-stage and two-stage object detectors. A detailed discussion is given of the transformer-based detectors and lightweight networks category. Additionally, the evaluation metrics used for object detection methods are discussed systematically. The best object detection algorithms for different applications are discussed at the end of the survey. 
This survey is useful for beginners who want to study different object detection algorithms and their use in different applications.</div></div>","PeriodicalId":49523,"journal":{"name":"Signal Processing","volume":"237 ","pages":"Article 110075"},"PeriodicalIF":3.4,"publicationDate":"2025-05-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143913014","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"工程技术","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Signal Processing | Pub Date: 2025-05-06 | DOI: 10.1016/j.sigpro.2025.110073
Qilei Li, Wenhao Song, Mingliang Gao, Wenzhe Zhai, Qiang Zhou, Zhao Huang
{"title":"Towards text-refereed multi-modal image fusion by cross-modality interaction","authors":"Qilei Li , Wenhao Song , Mingliang Gao , Wenzhe Zhai , Qiang Zhou , Zhao Huang","doi":"10.1016/j.sigpro.2025.110073","DOIUrl":"10.1016/j.sigpro.2025.110073","url":null,"abstract":"<div><div>Multi-modal image fusion aims to generate a fused image that possesses the advantage of the source images in different modalities. The fused image is capable of significantly facilitating high-level vision tasks, <em>e.g.,</em> image segmentation and object detection. However, most existing fusion methods generally focus on preserving the structure and detailed representation of the fused images while failing to integrate the high-level semantic information in the source images. To address this problem, we propose a text-guided multi-modal image fusion framework, termed Cross-Modality Interaction (CMI)-Fusion. The proposed model leverages the robust capabilities of a large-scale foundation model, <em>i.e.,</em> Contrastive Language–Image Pre-training (CLIP), to achieve efficient interaction between image detail and text prompts. Specifically, a Dual Attention Feature Extraction (DAFE) module is derived to extract representative visual and semantic features. Moreover, a cross-modality Image-Text Interaction (ITI) module is derived to achieve a dynamic interaction between the image and corresponding text features. Extensive experiments on various multi-modal datasets demonstrate that the proposed CMI-Fusion retains image structural details and semantic content compared to the state-of-the-art methods. 
The code is available at https://github.com/songwenhao123/CMI-Fusion.</div></div>","PeriodicalId":49523,"journal":{"name":"Signal Processing","volume":"237 ","pages":"Article 110073"},"PeriodicalIF":3.4,"publicationDate":"2025-05-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143917348","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"工程技术","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Signal Processing | Pub Date: 2025-05-05 | DOI: 10.1016/j.sigpro.2025.110071
Bin Dong, Qianqian Bu, Zicong Zhu, Jingen Ni
{"title":"An active contour model with adaptive weighted mean filtering and anisotropic diffusion filtering","authors":"Bin Dong, Qianqian Bu, Zicong Zhu, Jingen Ni","doi":"10.1016/j.sigpro.2025.110071","DOIUrl":"10.1016/j.sigpro.2025.110071","url":null,"abstract":"<div><div>Active contour models (ACMs) are classic and effective methods for image segmentation. However, most existing ACMs cannot obtain satisfactory accuracy for segmenting images with inhomogeneous intensity caused by uneven illumination and noise. To address this issue, this work proposes an ACM with adaptive weighted mean filtering and anisotropic diffusion filtering (AWMAD). First, we propose a logarithmic-subtraction mechanism to decompose uneven illumination and noise of the original image, which can separate the high-frequency component with edge features from the original image. Then, we propose a fitted function based on adaptive weighted mean filtering to calculate the low frequency component, which suppresses the influence of uneven illumination on extraction of edge features. Moreover, we present a novel data driven term using anisotropic diffusion filtering and image entropy to suppress noise and enhance edge features. Finally, we construct an ACM based on our logarithmic-subtraction mechanism with the fitted function and data driven term, which eliminates convolution operations during the level set function iterations and accelerates the segmentation speed of AWMAD. 
Experimental results demonstrate that AWMAD has a high segmentation accuracy for segmenting images with uneven illumination and noise.</div></div>","PeriodicalId":49523,"journal":{"name":"Signal Processing","volume":"237 ","pages":"Article 110071"},"PeriodicalIF":3.4,"publicationDate":"2025-05-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143921911","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"工程技术","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Signal Processing | Pub Date: 2025-05-05 | DOI: 10.1016/j.sigpro.2025.110060
Oscar G. Ibarra-Manzano, José A. Andrade-Lucio, Miguel A. Vázquez-Olguín, Yuriy S. Shmaliy
{"title":"Transfer function-based robust filtering: Review and critical evaluation","authors":"Oscar G. Ibarra-Manzano, José A. Andrade-Lucio, Miguel A. Vázquez-Olguín, Yuriy S. Shmaliy","doi":"10.1016/j.sigpro.2025.110060","DOIUrl":"10.1016/j.sigpro.2025.110060","url":null,"abstract":"<div><div>Promoted by Wilson in his 1989 year work through the convolution and Hankel operator norms, the transfer function approach (TFA) developed by many authors has earlier emerged as a novel trend of sorts in robust estimation of system state to minimize the estimation error bounded norm for the maximized error bounded norm. This paper takes a fresh look at the problem through the bias correction gain $K$ of a recursive filter, reviews and revisits the existing robust $H_2$, energy-to-energy or $H_\\infty$, energy-to-peak or generalized $H_2$ (G$H_2$), and peak-to-peak or $L_1$ filtering solutions, and critically evaluates their performances. It is shown that the effective $K$ ranges between the larger gain of the optimal Kalman and the smaller gain of the robust unbiased finite impulse response (UFIR) filter. That is, regardless of the robust criterion, the gain produced by the sophisticated TFA turns out to be quite sandwiched by the Kalman and UFIR filters. 
The filters are tested based on extensive numerical simulations and experimentally in terms of mean square error, robustness, and quality factor.</div></div>","PeriodicalId":49523,"journal":{"name":"Signal Processing","volume":"237 ","pages":"Article 110060"},"PeriodicalIF":3.4,"publicationDate":"2025-05-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143921910","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"工程技术","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Signal Processing | Pub Date: 2025-05-02 | DOI: 10.1016/j.sigpro.2025.110069
Jun-Ru Yang, Zhang-Lei Shi, Xiao-Peng Li, Wenxin Xiong, Yaru Fu, Xijun Liang
{"title":"Adaptive robust MIMO radar target localization via capped Frobenius norm","authors":"Jun-Ru Yang , Zhang-Lei Shi , Xiao-Peng Li , Wenxin Xiong , Yaru Fu , Xijun Liang","doi":"10.1016/j.sigpro.2025.110069","DOIUrl":"10.1016/j.sigpro.2025.110069","url":null,"abstract":"<div><div>Most of the existing algorithms for multiple-input multiple-output radar target localization assume that the bistatic range measurements are contaminated by one certain kind of noise only, such as Gaussian noise and impulsive noise. However, when the practical noise violates the original assumed distribution, their localization performance degrades severely. Therefore, adaptive and robust localization algorithms that can achieve good localization performance under both Gaussian and impulsive noise are highly desirable. In this paper, we exploit the truncated least squares loss function called capped Frobenius norm (CFN) to resist outliers. An adaptive update scheme is developed to automatically determine the upper bound of CFN using the normalized median absolute deviation. Then, the nonconvex and nonsmooth CFN-based formulation is transformed into a regularized $\\ell_2$-norm optimization problem based on the half-quadratic theory. The alternating optimization (AO) algorithm is adopted as the solver, and closed-form solutions for both subproblems are derived. We also show that the sequence of objective function value generated by the devised algorithm converges. Experimental results verify the superiority of the proposed algorithm over several existing algorithms in terms of localization accuracy under impulsive noise. 
Furthermore, the devised algorithm can attain comparable performance to $\\ell_2$-norm based methods without tweaking hyperparameters under Gaussian noise.</div></div>","PeriodicalId":49523,"journal":{"name":"Signal Processing","volume":"237 ","pages":"Article 110069"},"PeriodicalIF":3.4,"publicationDate":"2025-05-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143917347","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"工程技术","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Signal Processing | Pub Date: 2025-04-30 | DOI: 10.1016/j.sigpro.2025.110070
Pengfei Fang, Wenling Li, Jia Song, Xiaoming Li, Li Ma
{"title":"Joint state estimation and topology inference for graphical dynamical systems","authors":"Pengfei Fang , Wenling Li , Jia Song , Xiaoming Li , Li Ma","doi":"10.1016/j.sigpro.2025.110070","DOIUrl":"10.1016/j.sigpro.2025.110070","url":null,"abstract":"<div><div>In this paper, we consider the problem of joint state estimation and topology inference for a class of graphical dynamical systems, where the graph topology matrix is involved in the dynamical systems. A non-convex objective function, containing an equality constraint on the row sum of the topology matrix, is established with respect to the state and the topology, in which the estimated node states and observations at the historical time steps are used to infer the graph topology, and a regularization term is designed to enhance the sparsity of the graph topology. Then, the state estimation and topology inference are obtained by solving two convex subproblems in manner of using the Kalman filtering and the alternating direction method of multipliers (ADMM) algorithms, respectively. Specially, by separating the non-differentiable regularization term and utilizing a proximity operator, we derive an iterative solution with high computational efficiency to infer the graph topology in the ADMM algorithm. To verify the effectiveness of the proposed algorithm, simulation with a car-following model is carried out.</div></div>","PeriodicalId":49523,"journal":{"name":"Signal Processing","volume":"237 ","pages":"Article 110070"},"PeriodicalIF":3.4,"publicationDate":"2025-04-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143906414","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"工程技术","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Signal Processing | Pub Date: 2025-04-29 | DOI: 10.1016/j.sigpro.2025.110079
Jiacheng Ge, Yingqiang Qiu, Zhisheng Chen, Kaimeng Chen, Xiaodan Lin, Yufeng Dai
{"title":"Separable and high-capacity reversible data hiding for encrypted 3D mesh models based on dual multi-MSB predictions","authors":"Jiacheng Ge , Yingqiang Qiu , Zhisheng Chen , Kaimeng Chen , Xiaodan Lin , Yufeng Dai","doi":"10.1016/j.sigpro.2025.110079","DOIUrl":"10.1016/j.sigpro.2025.110079","url":null,"abstract":"<div><div>Three-dimensional (3D) models, essential for building virtual worlds, are encountering growing challenges in privacy and copyright protection as their usage increases. Reversible data hiding (RDH) in encrypted 3D mesh models not only protects the privacy of the original models through encryption but also embeds additional data for covert communication or access control. This paper proposes a high-capacity, separable RDH method for encrypted 3D models. The approach utilizes integer mapping and incorporates an enhanced dual multiple most significant bit (multi-MSB) prediction strategy to maximize embedding capacity. First, each vertex coordinate is scaled to a decimal value within a predefined range. These values are then encoded into binary digits using integer mapping, with the number of digits determined by a compression threshold. Subsequently, all vertices are processed to identify redundant data that served as embedding room using a multi-MSB self-prediction algorithm, significantly increasing the embedding capacity. Next, after disregarding the redundancy in the MSBs of each vertex, the vertices are classified into an embeddable set and a reference set. The embeddable vertices are then further processed to create additional embedding room through secondary multi-MSB prediction. The auxiliary data, compressed using arithmetic coding, is embedded into the multi-MSB of each encrypted vertex, resulting in encrypted vertices that contain both the auxiliary data and available embedding room. 
Using the auxiliary data, encrypted additional data is embedded into the reserved embedding room within the multi-MSB of each vertex through bit substitution. Finally, the embedded data can be extracted without errors, and the original 3D mesh can be recovered losslessly. The experimental results demonstrate that the proposed method is highly effective, achieving superior embedding capacity compared to several state-of-the-art methods.</div></div>","PeriodicalId":49523,"journal":{"name":"Signal Processing","volume":"237 ","pages":"Article 110079"},"PeriodicalIF":3.4,"publicationDate":"2025-04-29","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143906413","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"工程技术","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Transceiver beamforming design of RIS-aided active array radar in cluttered environments","authors":"Qi Feng, Shengyao Chen, Longyao Ran, Feng Xi, Hongtao Li, Sirui Tian, Zhong Liu","doi":"10.1016/j.sigpro.2025.110059","DOIUrl":"10.1016/j.sigpro.2025.110059","url":null,"abstract":"<div><div>This paper equips a reconfigurable intelligent surface (RIS) to assist the active array radar for boosting its interference suppression ability and enhancing the beamforming gain towards target direction simultaneously. The output signal-to-interference-plus-noise ratio (SINR) is chosen as the metric to jointly design the transmit and receive beamformers of radar array and RIS reflection coefficients. In light of SINR performance and implementation complexity, two operation modes using identical or distinct RIS reflection coefficients in transmit and receive stages (ITR or DTR) are investigated. In each mode, the proposed joint design is formulated into a nonconvex constrained fractional programming problem and the solving algorithm is customized under the block coordinate descent framework. Specifically, the RIS reflection coefficients are respectively optimized by the quartic Riemannian Newton method (RNM) in ITR mode and by the quadratic RNM in DTR mode after Dinkelbach transform. Moreover, a simplified scheme under DTR mode is also given to speed up processing, where the beamforming of radar array and RIS separately concentrates on the beamforming gain enhancement and interference suppression in transmit and receive stages. 
Numerical results display that both ITR and DTR modes significantly outperform the array radars using an RIS only in receive or transmit stage.</div></div>","PeriodicalId":49523,"journal":{"name":"Signal Processing","volume":"237 ","pages":"Article 110059"},"PeriodicalIF":3.4,"publicationDate":"2025-04-28","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143892105","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"工程技术","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Signal Processing | Pub Date: 2025-04-28 | DOI: 10.1016/j.sigpro.2025.110043
Xiaoping Liu, Gong Chen, Jun Shi, Ran Tao
{"title":"An interpretable convolutional neural network via generalized time–frequency scattering","authors":"Xiaoping Liu , Gong Chen , Jun Shi , Ran Tao","doi":"10.1016/j.sigpro.2025.110043","DOIUrl":"10.1016/j.sigpro.2025.110043","url":null,"abstract":"<div><div>Convolutional neural networks (CNNs) have recently demonstrated impressive performance in complex machine learning tasks. However, the CNN requires a large quantity of annotated data to converge to a good solution, and the theoretical understanding of this network is still in its infancy. Towards this end, a variant of the CNN, dubbed the deep scattering network (DSN), has been proposed by employing the linear time–frequency transform. The DSN inherits the hierarchical structure of the CNN, but chooses predefined wavelet/Gabor filters as its convolutional kernels instead of data-driven linear filters. Unfortunately, the DSN suffers from a major drawback that it is suitable for stationary image textures but not for non-stationary image textures, since wavelet/Gabor filters are intrinsically linear translation-invariant filters. The aim of this paper is to overcome this deficiency based upon a generalized linear time–frequency transform–the short-time fractional Fourier transform (STFRFT) which can be interpreted as a bank of linear translation-variant filters and thus may be well suitable for non-stationary texture analysis. We first introduce a generalized time–frequency scattering transform using the STFRFT. By applying the derived result, we propose an interpretable CNN by cascading the STFRFTs and modulus operators. Moreover, several basic properties of the proposed interpretable CNN are derived, and an efficient implementation of this network is also presented. 
Finally, the applications of the derived results are discussed.</div></div>","PeriodicalId":49523,"journal":{"name":"Signal Processing","volume":"237 ","pages":"Article 110043"},"PeriodicalIF":3.4,"publicationDate":"2025-04-28","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143896059","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"工程技术","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Signal Processing | Pub Date: 2025-04-28 | DOI: 10.1016/j.sigpro.2025.110035
Nawel Arab, Yassine Mhiri, Isabelle Vin, Mohammed Nabil El Korso, Pascal Larzabal
{"title":"Unrolled expectation maximization algorithm for radio interferometric imaging in presence of non Gaussian interferences","authors":"Nawel Arab , Yassine Mhiri , Isabelle Vin , Mohammed Nabil El Korso , Pascal Larzabal","doi":"10.1016/j.sigpro.2025.110035","DOIUrl":"10.1016/j.sigpro.2025.110035","url":null,"abstract":"<div><div>This paper proposes an unrolled Expectation Maximization (EM) algorithm tailored for robust radio interferometric imaging in the presence of non-Gaussian radio interferences. We introduce a compound Gaussian model for the observation noise and derive an unrolled neural architecture based on the EM algorithm to tackle the reconstruction problem in a robust manner. This innovative approach aims to enhance image reconstruction by simultaneously incorporating model information and generalization for the case of non-Gaussian heavy-tailed noise distribution, while leveraging the benefits of deep learning. Our experiments demonstrate significant improvements over state-of-the-art methods, highlighting the efficacy of our proposed scheme in handling the complexities of radiofrequency interference and improving image reconstruction accuracy.</div></div>","PeriodicalId":49523,"journal":{"name":"Signal Processing","volume":"237 ","pages":"Article 110035"},"PeriodicalIF":3.4,"publicationDate":"2025-04-28","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143903359","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"工程技术","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}