{"title":"Distributed Stochastic Gradient Descent With Staleness: A Stochastic Delay Differential Equation Based Framework","authors":"Siyuan Yu;Wei Chen;H. Vincent Poor","doi":"10.1109/TSP.2025.3546574","DOIUrl":"10.1109/TSP.2025.3546574","url":null,"abstract":"Distributed stochastic gradient descent (SGD) has attracted considerable recent attention due to its potential for scaling computational resources, reducing training time, and helping protect user privacy in machine learning. However, stragglers and limited bandwidth may induce random computational/communication delays, thereby severely hindering the learning process. Therefore, how to accelerate asynchronous SGD (ASGD) by efficiently scheduling multiple workers is an important issue. In this paper, a unified framework is presented to analyze and optimize the convergence of ASGD based on stochastic delay differential equations (SDDEs) and the Poisson approximation of aggregated gradient arrivals. In particular, we present the run time and staleness of distributed SGD without a memorylessness assumption on the computation times. Given the learning rate, we reveal the relevant SDDE's damping coefficient and its delay statistics, as functions of the number of activated clients, staleness threshold, the eigenvalues of the Hessian matrix of the objective function, and the overall computational/communication delay. The formulated SDDE allows us to present both the distributed SGD's convergence condition and speed by calculating its characteristic roots, thereby optimizing the scheduling policies for asynchronous/event-triggered SGD. Interestingly, it is shown that increasing the number of activated workers does not necessarily accelerate distributed SGD due to staleness. Moreover, a small degree of staleness does not necessarily slow down the convergence, while a large degree of staleness will result in the divergence of distributed SGD. 
Numerical results demonstrate the potential of our SDDE framework, even in complex learning tasks with non-convex objective functions.","PeriodicalId":13330,"journal":{"name":"IEEE Transactions on Signal Processing","volume":"73 ","pages":"1708-1726"},"PeriodicalIF":4.6,"publicationDate":"2025-03-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143546460","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"工程技术","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Third-Order Sum-Difference Expansion: An Array Extension Strategy Based on Third-Order Cumulants","authors":"Haodong Guo;Hua Chen;Wei Liu;Songjie Yang;Chau Yuen;Hing Cheung So","doi":"10.1109/TSP.2025.3564930","DOIUrl":"10.1109/TSP.2025.3564930","url":null,"abstract":"Recently, numerous design schemes for high-order sparse linear arrays (SLAs) have been introduced for underdetermined direction-of-arrival (DOA) estimation based on high-order cumulants, which utilize both difference co-array (DCA) and sum co-array (SCA) of the generator arrays to construct a large consecutive virtual co-array, achieving a significant increase in the number of uniform degrees-of-freedom (uDOFs). However, this processing places high demands on the generator arrays, which require both long consecutive DCA and SCA. In addition, the robustness of the derived array is prone to deterioration, due to reduced redundancy between DCA and SCA. To that end, in this paper, an alternative design scheme for third-order SLAs termed third-order sum-difference expansion (TO-SDE) is proposed, which no longer separates DCA and SCA by a shift factor, but considers them as a unified whole. In so doing, most desirable characteristics of the generator array are preserved, such as the size of consecutive virtual co-array, resistance to mutual coupling, and robustness against sensor failures, while the mapping from the sum-difference co-array based second-order SLAs to the third-order is achieved. 
By selecting the appropriate generator array, excellent DOA estimation performance can be attained in various scenarios.","PeriodicalId":13330,"journal":{"name":"IEEE Transactions on Signal Processing","volume":"73 ","pages":"2099-2109"},"PeriodicalIF":4.6,"publicationDate":"2025-03-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143898319","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"工程技术","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Generalized Quantum State Tomography With Hybrid Denoising Priors","authors":"Duoduo Xue;Wenrui Dai;Ziyang Zheng;Chenglin Li;Junni Zou;Hongkai Xiong","doi":"10.1109/TSP.2025.3546655","DOIUrl":"10.1109/TSP.2025.3546655","url":null,"abstract":"Quantum state tomography (QST) is the gold standard for estimating the unknown state of quantum systems but suffers from the longstanding problem of exponentially growing measurements. Recent deep learning approaches alleviate the problem by training neural networks in a data-driven manner without theoretical convergence guarantees. They lack reliability in tomography quality and are restricted by retraining when generalized to extensive quantum systems with varying realizations in different environments. To address this issue, we propose the first generalized deep learning method for QST, named HD-QST, that leverages hybrid denoising priors with well-established convergence guarantees to fit extensive quantum states and environments. Hybrid denoising priors are achieved with the convex combination of analytic low-rank denoiser and neural network-based smooth denoiser dynamically determined via reinforcement learning. We demonstrate in theory that, when characterizing a quantum system, HD-QST guarantees a geometric convergence rate under the relaxed condition that the true density matrix is the fixed point of any one denoiser rather than the shared fixed point of multiple denoisers. Extensive experiments show that, compared with existing methods, HD-QST consistently obtains superior fidelity for pure and mixed quantum states in both single-device simulations and cross-device applications. 
Remarkably, it achieves state-of-the-art fidelity in QST of real-world W states by trapped ions and precise evaluation and comparison on cross-device QST on IBM quantum computers.","PeriodicalId":13330,"journal":{"name":"IEEE Transactions on Signal Processing","volume":"73 ","pages":"1532-1548"},"PeriodicalIF":4.6,"publicationDate":"2025-02-28","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143526156","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"工程技术","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Bayesian Two-Sample Hypothesis Testing Using the Uncertain Likelihood Ratio: Improving the Generalized Likelihood Ratio Test","authors":"James Z. Hare;Yuchen Liang;Lance M. Kaplan;Venugopal V. Veeravalli","doi":"10.1109/TSP.2025.3546169","DOIUrl":"10.1109/TSP.2025.3546169","url":null,"abstract":"Two-sample hypothesis testing is a common practice in many fields of science, where the goal is to identify whether a set of observations and a set of training data are drawn from the same distribution. Traditionally, this is achieved using parametric and non-parametric frequentist tests, such as the Generalized Likelihood Ratio (GLR) test. However, these tests are not optimal in a Neyman-Pearson sense, especially when the number of observations and training samples are finite. Therefore, in this work, we study a parametric Bayesian test, called the Uncertain Likelihood Ratio (ULR) test, and compare its performance to the traditional GLR test. We establish that the ULR test is the optimal test for any number of samples when the parameters of the likelihood models are drawn from the true prior distribution. We then study an asymptotic form of the ULR test statistic and compare it against the GLR test statistic. As a byproduct of this analysis, we establish a new asymptotic optimality property for the GLR test when the parameters of the likelihood models are drawn from the Jeffreys prior. 
Furthermore, we analyze conditions under which the ULR test outperforms the GLR test, and include a numerical study to validate the results.","PeriodicalId":13330,"journal":{"name":"IEEE Transactions on Signal Processing","volume":"73 ","pages":"1410-1425"},"PeriodicalIF":4.6,"publicationDate":"2025-02-28","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143526157","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"工程技术","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Communication-Efficient Distributed Bayesian Federated Learning Over Arbitrary Graphs","authors":"Sihua Wang;Huayan Guo;Xu Zhu;Changchuan Yin;Vincent K. N. Lau","doi":"10.1109/TSP.2025.3546328","DOIUrl":"10.1109/TSP.2025.3546328","url":null,"abstract":"This paper investigates a fully distributed federated learning (FL) problem, in which each device is restricted to only utilize its local dataset and the information received from its adjacent devices that are defined in a communication graph to update the local model weights for minimizing the global loss function. To incorporate the communication graph constraint into the joint posterior distribution, we exploit the fact that the model weights on each device are a function of its local likelihood and local prior, and then the connectivity between adjacent devices is modeled by a Dirac Delta distribution. In this way, the joint distribution can be factorized naturally by a factor graph. Based on the Dirac Delta-based factor graph, we propose a novel distributed approximate Bayesian inference algorithm that combines loopy belief propagation (LBP) and variational Bayesian inference (VBI) for distributed FL. Specifically, VBI is used to approximate the non-Gaussian marginal posterior as a Gaussian distribution in the local training process, and then the global training process resembles Gaussian LBP where only the mean and variance are passed among adjacent devices. Furthermore, we propose a new damping factor design according to the communication graph topology to mitigate the potential divergence and achieve consensus convergence. 
Simulation results verify that the proposed solution achieves faster convergence speed with better performance than baselines.","PeriodicalId":13330,"journal":{"name":"IEEE Transactions on Signal Processing","volume":"73 ","pages":"1351-1366"},"PeriodicalIF":4.6,"publicationDate":"2025-02-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143507268","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"工程技术","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Distributed Adaptive Spatial Filtering With Inexact Local Solvers","authors":"Charles Hovine;Alexander Bertrand","doi":"10.1109/TSP.2025.3546484","DOIUrl":"10.1109/TSP.2025.3546484","url":null,"abstract":"The Distributed Adaptive Signal Fusion (DASF) framework is a meta-algorithm for computing data-driven spatial filters in a distributed sensing platform with limited bandwidth and computational resources, such as a wireless sensor network. The convergence and optimality of the DASF algorithm have been extensively studied under the assumption that an exact, but possibly impractical, solver for the local optimization problem at each updating node is available. In this work, we provide convergence and optimality results for the DASF framework when used with an inexact, finite-time solver such as (proximal) gradient descent or Newton's method. We provide sufficient conditions that the solver should satisfy in order to guarantee convergence of the resulting algorithm as well as numerical simulations to validate these theoretical results.","PeriodicalId":13330,"journal":{"name":"IEEE Transactions on Signal Processing","volume":"73 ","pages":"1262-1277"},"PeriodicalIF":4.6,"publicationDate":"2025-02-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143507269","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"工程技术","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Learning Task-Based Trainable Neuromorphic ADCs via Power-Aware Distillation","authors":"Tal Vol;Loai Danial;Nir Shlezinger","doi":"10.1109/TSP.2025.3546458","DOIUrl":"10.1109/TSP.2025.3546458","url":null,"abstract":"The ability to process signals in digital form depends on analog-to-digital converters (ADCs). Traditionally, ADCs are designed to ensure that the digital representation closely matches the analog signal. However, recent studies have shown that significant power and memory savings can be achieved through task-based acquisition, where the acquisition process is tailored to the downstream processing task. An emerging technology for task-based acquisition involves the use of memristors, which are considered key enablers for neuromorphic computing. Memristors can implement ADCs with tunable mappings, allowing adaptation to specific system tasks or power constraints. In this work, we study task-based acquisition for a generic classification task using memristive ADCs. We consider the unique characteristics of such neuromorphic ADCs, including their power consumption and noisy read-write behavior, and propose a physically compliant model based on resistive successive approximation register ADCs integrated with memristor components, enabling the adjustment of quantization regions. To optimize performance, we introduce a data-driven algorithm that jointly tunes task-based memristive ADCs alongside both digital and analog processing. Our design addresses the inherent stochasticity of memristors through power-aware distillation, complemented by a specialized learning algorithm that adapts to their unique analog-to-digital mapping. The proposed approach is shown to enhance accuracy by up to 27% and reduce power consumption by up to 66% compared to uniform ADCs. Even under noisy conditions, our method achieves substantial gains, with accuracy improvements of up to 19% and power reductions of up to 57%. 
These results highlight the effectiveness of our power-aware neuromorphic ADCs in improving system performance across diverse tasks.","PeriodicalId":13330,"journal":{"name":"IEEE Transactions on Signal Processing","volume":"73 ","pages":"1246-1261"},"PeriodicalIF":4.6,"publicationDate":"2025-02-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143507278","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"工程技术","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Event-Triggered State Estimation Through Confidence Level","authors":"Wei Liu","doi":"10.1109/TSP.2025.3543721","DOIUrl":"10.1109/TSP.2025.3543721","url":null,"abstract":"This paper considers the state estimation problem for discrete-time linear systems under an event-triggered scheme. In order to improve performance, a novel event-triggered scheme based on confidence level is proposed using the chi-square distribution and a mild regularity assumption. In terms of the novel event-triggered scheme, a minimum mean squared error (MMSE) state estimator is proposed using some results presented in this paper. Two algorithms for communication rate estimation of the proposed MMSE state estimator are developed, where the first algorithm is based on information with one-step delay, and the second algorithm is based on information with two-step delay. The performance and effectiveness of the proposed MMSE state estimator and the two communication rate estimation algorithms are illustrated using a target tracking scenario.","PeriodicalId":13330,"journal":{"name":"IEEE Transactions on Signal Processing","volume":"73 ","pages":"1337-1350"},"PeriodicalIF":4.6,"publicationDate":"2025-02-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143495420","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"工程技术","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Relative Entropy Based Jamming Signal Design Against Radar Target Detection","authors":"Zhou Xu;Bo Tang;Weihua Ai;Jiahua Zhu","doi":"10.1109/TSP.2025.3544305","DOIUrl":"10.1109/TSP.2025.3544305","url":null,"abstract":"In modern electronic warfare, active jamming is an important way to prevent the target from being detected by the radar sensors. This paper considers the problem of designing effective jamming signals with limited jamming power. By taking the relative entropy as the figure of merit, we formulate the jamming signal design as a matrix optimization problem which is NP-hard in general. To solve the resultant problem, we conceive an iterative algorithm, named the Relative Entropy Jamming Optimization Algorithm (REJOA), which combines the Majorization Minimization (MM) technique with matrix factorization. The conceived algorithm updates the optimization variable in a closed form (or semi-closed form) at each iteration, and guarantees theoretical convergence. Finally, we compare our design with the Mutual Information (MI) based design and the Signal to Jamming plus Noise Ratio (SJNR) based design through numerical experiments. Results highlight that, compared with the state-of-the-art designs, our design achieves better jamming performance with the same jamming power.","PeriodicalId":13330,"journal":{"name":"IEEE Transactions on Signal Processing","volume":"73 ","pages":"1200-1215"},"PeriodicalIF":4.6,"publicationDate":"2025-02-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143495421","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"工程技术","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Reinforcement Learning Based Online Algorithm for Near-Field Time-Varying IRS Phase Shift Optimization: System Evolution Perspective","authors":"Zongtai Li;Rui Wang;Erwu Liu","doi":"10.1109/TSP.2025.3545164","DOIUrl":"10.1109/TSP.2025.3545164","url":null,"abstract":"This paper proposes a reinforcement learning (RL) based intelligent reflecting surface (IRS) incremental control algorithm for a mmWave time-varying multi-user multiple-input single-output (MU-MISO) system. The research focuses on addressing the key challenge of near-field IRS design, which involves time-varying channels due to users’ mobility. In practice, the optimization becomes more challenging when the components of the concatenated channel are unknown. From a higher perspective, we leverage electromagnetic information theory and manifold theory to provide a unified description of the IRS-assisted MU-MISO system. We regard the communication system as a nonlinear dynamic system on reproducing kernel Hilbert space (RKHS), upon which the approximate evolution operator is defined as observables for system evolution. The IRS phase shift optimization problem is modeled as a nonlinear system eigenvalue maximization problem. Utilizing the geometric properties of the unitary evolution operator, we define a metric space where the geodesic-based distance function satisfies the Lipschitz condition, enabling efficient exploitation of channel similarities. We transform the complex non-convex optimization problem into a low-dimensional linear contextual bandit problem. 
The performance of the proposed GLinUCB algorithm is evaluated through numerical simulations in various scenarios, showing its effectiveness in achieving high sum rates with fast convergence speed.","PeriodicalId":13330,"journal":{"name":"IEEE Transactions on Signal Processing","volume":"73 ","pages":"1231-1245"},"PeriodicalIF":4.6,"publicationDate":"2025-02-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143486209","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"工程技术","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}