{"title":"Representation Synthesis by Probabilistic Many-Valued Logic Operation in Self-Supervised Learning","authors":"Hiroki Nakamura;Masashi Okada;Tadahiro Taniguchi","doi":"10.1109/OJSP.2024.3399663","DOIUrl":"https://doi.org/10.1109/OJSP.2024.3399663","url":null,"abstract":"In this paper, we propose a new self-supervised learning (SSL) method for representations that enable logic operations. Representation learning has been applied to various tasks like image generation and retrieval. The logical controllability of representations is important for these tasks. Although some methods have been shown to enable the intuitive control of representations using natural languages as the inputs, representation control via logic operations between representations has not been demonstrated. Some SSL methods using representation synthesis (e.g., elementwise mean and maximum operations) have been proposed, but the operations performed in these methods do not incorporate logic operations. In this work, we propose a logic-operable self-supervised representation learning method by replacing the existing representation synthesis with the OR operation on the probabilistic extension of many-valued logic. The representations comprise a set of feature-possession degrees, which are truth values indicating the presence or absence of each feature in the image, and realize the logic operations (e.g., OR and AND). Our method can generate a representation that has the features of both representations or only those features common to both representations. Furthermore, the expression of the ambiguous presence of a feature is realized by indicating the feature-possession degree by the probability distribution of truth values of the many-valued logic. We showed that our method performs competitively in single and multi-label classification tasks compared with prior SSL methods using synthetic representations. 
Moreover, experiments on image retrieval using MNIST and PascalVOC showed the representations of our method can be operated by OR and AND operations.","PeriodicalId":73300,"journal":{"name":"IEEE open journal of signal processing","volume":"5 ","pages":"831-840"},"PeriodicalIF":2.9,"publicationDate":"2024-03-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=10528856","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141543877","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"PtychoDV: Vision Transformer-Based Deep Unrolling Network for Ptychographic Image Reconstruction","authors":"Weijie Gan;Qiuchen Zhai;Michael T. McCann;Cristina Garcia Cardona;Ulugbek S. Kamilov;Brendt Wohlberg","doi":"10.1109/OJSP.2024.3375276","DOIUrl":"https://doi.org/10.1109/OJSP.2024.3375276","url":null,"abstract":"Ptychography is an imaging technique that captures multiple overlapping snapshots of a sample, illuminated coherently by a moving localized probe. The image recovery from ptychographic data is generally achieved via an iterative algorithm that solves a nonlinear phase retrieval problem derived from measured diffraction patterns. However, these iterative approaches have high computational cost. In this paper, we introduce PtychoDV, a novel deep model-based network designed for efficient, high-quality ptychographic image reconstruction. PtychoDV comprises a vision transformer that generates an initial image from the set of raw measurements, taking into consideration their mutual correlations. This is followed by a deep unrolling network that refines the initial image using learnable convolutional priors and the ptychography measurement model. 
Experimental results on simulated data demonstrate that PtychoDV is capable of outperforming existing deep learning methods for this problem, and significantly reduces computational cost compared to iterative methodologies, while maintaining competitive performance.","PeriodicalId":73300,"journal":{"name":"IEEE open journal of signal processing","volume":"5 ","pages":"539-547"},"PeriodicalIF":0.0,"publicationDate":"2024-03-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=10463649","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140621193","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Towards a Geometric Understanding of Spatiotemporal Graph Convolution Networks","authors":"Pratyusha Das;Sarath Shekkizhar;Antonio Ortega","doi":"10.1109/OJSP.2024.3396635","DOIUrl":"https://doi.org/10.1109/OJSP.2024.3396635","url":null,"abstract":"Spatiotemporal graph convolutional networks (STGCNs) have emerged as a desirable model for skeleton-based human action recognition. Despite achieving state-of-the-art performance, there is a limited understanding of the representations learned by these models, which hinders their application in critical and real-world settings. While layerwise analysis of CNN models has been studied in the literature, to the best of our knowledge, there exists no study on the layerwise explainability of the embeddings learned on spatiotemporal data using STGCNs. In this paper, we first propose to use a local Dataset Graph (DS-Graph) obtained from the feature representation of input data at each layer to develop an understanding of the layer-wise embedding geometry of the STGCN. To do so, we develop a window-based dynamic time warping (DTW) method to compute the distance between data sequences with varying temporal lengths. To validate our findings, we have developed a layer-specific Spatiotemporal Graph Gradient-weighted Class Activation Mapping (L-STG-GradCAM) technique tailored for spatiotemporal data. This approach enables us to visually analyze and interpret each layer within the STGCN network. We characterize the functions learned by each layer of the STGCN using the label smoothness of the representation and visualize them using our L-STG-GradCAM approach. Our proposed method is generic and can yield valuable insights for STGCN architectures in different applications. However, this paper focuses on the human activity recognition task as a representative application. 
Our experiments show that STGCN models learn representations that capture general human motion in their initial layers while discriminating different actions only in later layers. This justifies experimental observations showing that fine-tuning deeper layers works well for transfer between related tasks. We provide experimental evidence for different human activity datasets and advanced spatiotemporal graph networks to validate that the proposed method is general enough to analyze any STGCN model and can be useful for drawing insight into networks in various scenarios. We also show that noise at the input has a limited effect on label smoothness, which can help justify the robustness of STGCNs to noise.","PeriodicalId":73300,"journal":{"name":"IEEE open journal of signal processing","volume":"5 ","pages":"1023-1030"},"PeriodicalIF":2.9,"publicationDate":"2024-03-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=10518107","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142316397","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Attention-Based End-to-End Differentiable Particle Filter for Audio Speaker Tracking","authors":"Jinzheng Zhao;Yong Xu;Xinyuan Qian;Haohe Liu;Mark D. Plumbley;Wenwu Wang","doi":"10.1109/OJSP.2024.3363649","DOIUrl":"https://doi.org/10.1109/OJSP.2024.3363649","url":null,"abstract":"Particle filters (PFs) have been widely used in speaker tracking due to their capability in modeling a non-linear process or a non-Gaussian environment. However, particle filters are limited by several issues. For example, pre-defined handcrafted measurements are often used which can limit the model performance. In addition, the transition and update models are often preset which make PF less flexible to be adapted to different scenarios. To address these issues, we propose an end-to-end differentiable particle filter framework by employing the multi-head attention to model the long-range dependencies. The proposed model employs the self-attention as the learned transition model and the cross-attention as the learned update model. To our knowledge, this is the first proposal of combining particle filter and transformer for speaker tracking, where the measurement extraction, transition and update steps are integrated into an end-to-end architecture. 
Experimental results show that the proposed model achieves superior performance over the recurrent baseline models.","PeriodicalId":73300,"journal":{"name":"IEEE open journal of signal processing","volume":"5 ","pages":"449-458"},"PeriodicalIF":0.0,"publicationDate":"2024-02-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=10428039","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"139976169","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Efficient Channel-Temporal Attention for Boosting RF Fingerprinting","authors":"Hanqing Gu;Lisheng Su;Yuxia Wang;Weifeng Zhang;Chuan Ran","doi":"10.1109/OJSP.2024.3362695","DOIUrl":"https://doi.org/10.1109/OJSP.2024.3362695","url":null,"abstract":"In recent years, Deep Convolutional Neural Networks (DCNNs) have been widely used to solve the Radio Frequency (RF) fingerprinting task. DCNNs are capable of learning the proper convolution kernels driven by data and directly extracting RF fingerprints from raw In-phase/Quadrature (IQ) data which are brought by variations or minor flaws in transmitters' circuits, enabling the identification of a specific transmitter. One of the main challenges in employing this sort of technology is how to optimize model design so that it can automatically learn discriminative RF fingerprints and show robustness to changes in environmental factors. To this end, this paper proposes ECTAttention, an Efficient Channel-Temporal Attention block that can be used to enhance the feature learning capability of DCNNs. ECTAttention has two parallel branches. On the one hand, it automatically mines the correlation between channels through channel attention to discover and enhance important convolution kernels. On the other hand, it can recalibrate the feature map through temporal attention. ECTAttention has good flexibility and high efficiency, and can be combined with existing DCNNs to effectively enhance their feature learning ability on the basis of increasing only a small amount of computational consumption, so as to achieve high precision of RF fingerprinting. 
Our experimental results show that ResNet enhanced by ECTAttention can identify 10 USRP X310 SDRs with an accuracy of 97.5%, and achieve a recognition accuracy of 91.9% for 56 actual ADS-B signal sources under unconstrained acquisition environment.","PeriodicalId":73300,"journal":{"name":"IEEE open journal of signal processing","volume":"5 ","pages":"478-492"},"PeriodicalIF":0.0,"publicationDate":"2024-02-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=10423213","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"139987098","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A Family of Swish Diffusion Strategy Based Adaptive Algorithms for Distributed Active Noise Control","authors":"Rajapantula Kranthi;Vasundhara;Asutosh Kar;Mads Græsbøll Christensen","doi":"10.1109/OJSP.2024.3360860","DOIUrl":"https://doi.org/10.1109/OJSP.2024.3360860","url":null,"abstract":"The conventional filtered-x least mean square (F-xLMS) algorithm based distributed active noise control (DANC) system's performance suffers in the presence of outliers and impulse like disturbances. In an attempt to reduce noise in such an environment Swish function based algorithms for DANC systems have been proposed presently. The Swish function makes use of the smoothness and unboundedness properties for faster convergence and eliminating vanishing gradient issue. The intention is to employ the smooth approximation of Softplus and the non-convex property of Geman-McClure estimator to propose a Softplus Geman-McClure function. In addition, the bounded nonlinearity of Welsch function which is insensitive to the outliers is utilized with the regularization property of Softsign formulating Softsign Welsch method. Henceforth, this paper proposes a family of robust algorithms employing the Swish diffusion strategy for filtered-x sign, filtered-x LMS, filtered-x Softplus Geman-McClure and filtered-x Softsign Welsch algorithms for DANC systems. The weight update rules are derived for the proposed algorithms and convergence analysis is also carried out. 
The suggested methods achieve faster convergence in comparison with existing techniques and approximately 1–5 dB improvement in noise cancellation for various noise inputs and impulsive noise interferences.","PeriodicalId":73300,"journal":{"name":"IEEE open journal of signal processing","volume":"5 ","pages":"503-519"},"PeriodicalIF":0.0,"publicationDate":"2024-02-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=10418455","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140291176","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Correlated Sparse Bayesian Learning for Recovery of Block Sparse Signals With Unknown Borders","authors":"Didem Dogan;Geert Leus","doi":"10.1109/OJSP.2024.3360914","DOIUrl":"https://doi.org/10.1109/OJSP.2024.3360914","url":null,"abstract":"We consider the problem of recovering complex-valued block sparse signals with unknown borders. Such signals arise naturally in numerous applications. Several algorithms have been developed to solve the problem of unknown block partitions. In pattern-coupled sparse Bayesian learning (PCSBL), each coefficient involves its own hyperparameter and those of its immediate neighbors to exploit the block sparsity. Extended block sparse Bayesian learning (EBSBL) assumes the block sparse signal consists of correlated and overlapping blocks to enforce block correlations. We propose a simpler alternative to EBSBL and reveal the underlying relationship between the proposed method and a particular case of EBSBL. The proposed algorithm uses the fact that immediate neighboring sparse coefficients are correlated. The proposed model is similar to classical sparse Bayesian learning (SBL). However, unlike the diagonal correlation matrix in conventional SBL, the unknown correlation matrix has a tridiagonal structure to capture the correlation with neighbors. Due to the entanglement of the elements in the inverse tridiagonal matrix, instead of a direct closed-form solution, an approximate solution is proposed. The alternative algorithm avoids the high dictionary coherence in EBSBL, reduces the unknowns of EBSBL, and is computationally more efficient. The sparse reconstruction performance of the algorithm is evaluated with both correlated and uncorrelated block sparse coefficients. Simulation results demonstrate that the proposed algorithm outperforms PCSBL and correlation-based methods such as EBSBL in terms of reconstruction quality. 
The numerical results also show that the proposed correlated SBL algorithm can deal with isolated zeros and nonzeros as well as block sparse patterns.","PeriodicalId":73300,"journal":{"name":"IEEE open journal of signal processing","volume":"5 ","pages":"421-435"},"PeriodicalIF":0.0,"publicationDate":"2024-01-31","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=10417118","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"139749902","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"DOA Estimation With Nested Arrays in Impulsive Noise Scenario: An Adaptive Order Moment Strategy","authors":"Xudong Dong;Jun Zhao;Jingjing Pan;Meng Sun;Xiaofei Zhang;Peihao Dong;Yide Wang","doi":"10.1109/OJSP.2024.3360896","DOIUrl":"https://doi.org/10.1109/OJSP.2024.3360896","url":null,"abstract":"Most of the existing direction of arrival (DOA) estimation methods in impulsive noise scenario are based on the fractional low-order moment statistics (FLOSs), such as the robust covariation-based (ROC), fractional low-order moment (FLOM), and phased fractional low-order moment (PFLOM). However, an unknown order moment parameter $p$ needs to be selected in these approaches, which inevitably increases the computational load if the optimal value of the parameter $p$ is determined by a large number of Monte Carlo experiments. To address this issue, we propose the adaptive order moment function (AOMF) and improved AOMF (IAOMF), which are applicable to the existing FLOSs-based methods and can also be extended to the case of sparse arrays. Moreover, we analyze the performance of AOMF and IAOMF, and simulation experiments verify the effectiveness of proposed methods.","PeriodicalId":73300,"journal":{"name":"IEEE open journal of signal processing","volume":"5 ","pages":"493-502"},"PeriodicalIF":0.0,"publicationDate":"2024-01-31","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=10417125","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"139987096","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Constrained Weighted Least-Squares Algorithms for 3-D AOA-Based Hybrid Localization","authors":"Yanbin Zou;Wenbo Wu;Jingna Fan;Huaping Liu","doi":"10.1109/OJSP.2024.3360901","DOIUrl":"https://doi.org/10.1109/OJSP.2024.3360901","url":null,"abstract":"Source localization with time-of-arrival (TOA), time-difference-of-arrival (TDOA), time-delay (TD), received-signal-strength (RSS), or angle-of-arrival (AOA) measurements from several spatially distributed sensors is commonly used in practice. Existing analysis of the Cramér-Rao lower bounds (CRLB) shows that a hybrid of two or more independent kinds of measurement has a lower CRLB than one individual type of measurement. This paper develops a unified constrained weighted-least squares (CWLS) algorithm for five types of hybrid localization systems: AOA and TOA (AOA/TOA), AOA and TDOA (AOA/TDOA), AOA and TD (AOA/TD), AOA and RSS (AOA/RSS), AOA, TOA, and RSS (AOA/TOA/RSS). These formulated CWLS problems only have one quadratic constraint, which can be effectively solved by the Lagrange multiplier method and root-finding algorithm. Extensive simulation results show that the proposed CWLS algorithms are superior to state-of-the-art algorithms and reach the CRLB.","PeriodicalId":73300,"journal":{"name":"IEEE open journal of signal processing","volume":"5 ","pages":"436-448"},"PeriodicalIF":0.0,"publicationDate":"2024-01-31","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=10417139","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"139749901","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Optimum Waveform Selection for Target State Estimation in the Joint Radar-Communication System","authors":"Ashoka Chakravarthi Mahipathi;Bethi Pardha Pardhasaradhi;Srinath Gunnery;Pathipati Srihari;John d'Souza;Paramananda Jena","doi":"10.1109/OJSP.2024.3359997","DOIUrl":"https://doi.org/10.1109/OJSP.2024.3359997","url":null,"abstract":"The widespread usage of the Radio Frequency (RF) spectrum for wireless and mobile communication systems generated a significant spectrum scarcity. The Joint Radar-Communication System (JRCS) provides a framework to simultaneously utilize the allocated radar spectrum for sensing and communication purposes. Generally, a Successive Interference Cancellation (SIC) based receiver is applied to mitigate mutual interference in the JRCS configuration. However, this SIC receiver model introduces a communication residual component. In response to this issue, the article presents a novel measurement model based on communication residual components for various radar waveforms. The radar system's performance within the JRCS framework is then evaluated using the Fisher Information Matrix (FIM). The radar waveforms considered in this investigation are rectangular pulse, triangular pulse, Gaussian pulse, Linear Frequency Modulated (LFM) pulse, LFM-Gaussian pulse, and Non-Linear Frequency Modulated (NLFM) pulse. After that, the Kalman filter is deployed to estimate the target kinematics (range and range rate) of a single linearly moving target for different waveforms. Additionally, range and range rate estimation errors are quantified using the Root Mean Square Error (RMSE) metric. Furthermore, the Posterior Cramer-Rao Lower Bound (PCRLB) is derived to validate the estimation accuracy of various waveforms. The simulation results show that the range and range rate estimation errors are within the PCRLB limit at all time instants for all the designated waveforms. 
The results further reveal that the NLFM pulse waveform provides improved range and range rate error performance compared to all other waveforms.","PeriodicalId":73300,"journal":{"name":"IEEE open journal of signal processing","volume":"5 ","pages":"459-477"},"PeriodicalIF":0.0,"publicationDate":"2024-01-29","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=10416352","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"139987097","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}