{"title":"Harnessing the Multi-Phasal Nature of Speech-EEG for Enhancing Imagined Speech Recognition","authors":"Rini Sharon;Mriganka Sur;Hema Murthy","doi":"10.1109/OJSP.2025.3528368","DOIUrl":"https://doi.org/10.1109/OJSP.2025.3528368","url":null,"abstract":"Analyzing speech-electroencephalogram (EEG) is pivotal for developing non-invasive and naturalistic brain-computer interfaces. Recognizing that the nature of human communication involves multiple phases like audition, imagination, articulation, and production, this study uncovers the shared cognitive imprints that represent speech cognition across these phases. Regression analysis using correlation metrics reveals pronounced inter-phasal congruence. This insight promotes a shift from single-phase-centric recognition models to harnessing integrated phase data, thereby enhancing recognition of cognitive speech. Having established the presence of inter-phase associations, a common representation learning feature extractor is introduced, adept at capturing the correlations and replicability across phases. The features so extracted are observed to provide superior discrimination of cognitive speech units. Notably, the proposed approach proves resilient even in the absence of comprehensive multi-phasal data. Through thorough control checks and illustrative topographical visualizations, our observations are substantiated. 
The findings indicate that the proposed multi-phase approach significantly enhances EEG-based speech recognition, achieving an accuracy gain of 18.2% for 25 cognitive units in continuous speech EEG over models reliant solely on single-phase data.","PeriodicalId":73300,"journal":{"name":"IEEE open journal of signal processing","volume":"6 ","pages":"78-88"},"PeriodicalIF":2.9,"publicationDate":"2025-01-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=10839023","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143379498","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Integrated Gaussian Processes for Tracking","authors":"Fred Lydeard;Bashar I. Ahmad;Simon Godsill","doi":"10.1109/OJSP.2025.3529308","DOIUrl":"https://doi.org/10.1109/OJSP.2025.3529308","url":null,"abstract":"In applications such as tracking and localisation, a dynamical model is typically specified for the modelling of an object's motion. An appealing alternative to the traditional parametric Markovian dynamical models is the Gaussian Process (GP). GPs can offer additional flexibility and represent non-Markovian, long-term dependencies in the target's kinematics. However, a standard GP with constant or zero mean is prone to oscillating around its mean and not sufficiently exploring the state space. In this paper, we consider extensions of the common GP framework such that a GP acts as the driving <i>disturbance</i> term that is integrated over time to produce a new Integrated GP (iGP) dynamical model. It potentially provides a more realistic modelling of agile objects' behaviour. We prove here that the introduced iGP model is, itself, a GP with a non-stationary kernel, which we derive fully in the case of the squared exponential GP kernel. Thus, the iGP is straightforward to implement, with the usual growth over time of the computational burden. We further show how to implement the model with fixed time complexity in a standard sequential Bayesian updating framework using Kalman filter-based computations, employing a sliding window Markovian approximation. 
Example results from real radar measurements and synthetic data are presented to demonstrate the ability of the proposed iGP modelling to facilitate more accurate tracking compared to conventional GP.","PeriodicalId":73300,"journal":{"name":"IEEE open journal of signal processing","volume":"6 ","pages":"99-107"},"PeriodicalIF":2.9,"publicationDate":"2025-01-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=10839315","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143379516","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A Survey on Machine Learning Techniques for Head-Related Transfer Function Individualization","authors":"Davide Fantini;Michele Geronazzo;Federico Avanzini;Stavros Ntalampiras","doi":"10.1109/OJSP.2025.3528330","DOIUrl":"https://doi.org/10.1109/OJSP.2025.3528330","url":null,"abstract":"Machine learning (ML) has become pervasive in various research fields, including binaural synthesis personalization, which is crucial for sound in immersive virtual environments. Researchers have mainly addressed this topic by estimating the individual head-related transfer function (HRTF). HRTFs are utilized to render audio signals at specific spatial positions, thereby simulating real-world sound wave interactions with the human body. As such, an HRTF that is compliant with individual characteristics enhances the realism of the binaural simulation. This survey systematically examines the HRTF individualization works based on ML proposed in the literature. The analyzed works are organized according to the processing steps involved in the ML workflow, including the employed dataset, input and output types, data preprocessing operations, ML models, and model evaluation. 
In addition to categorizing the works of the existing literature, this survey discusses their achievements, identifies their limitations, and outlines aspects that require further investigation at the crossroads of research communities in acoustics, audio signal processing, and machine learning.","PeriodicalId":73300,"journal":{"name":"IEEE open journal of signal processing","volume":"6 ","pages":"30-56"},"PeriodicalIF":2.9,"publicationDate":"2025-01-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=10836943","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143107176","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"The ICASSP 2024 Audio Deep Packet Loss Concealment Grand Challenge","authors":"Lorenz Diener;Solomiya Branets;Ando Saabas;Ross Cutler","doi":"10.1109/OJSP.2025.3526552","DOIUrl":"https://doi.org/10.1109/OJSP.2025.3526552","url":null,"abstract":"Audio packet loss concealment hides gaps in VoIP audio streams caused by network packet loss. It operates in real-time with low computational requirements and latency, as demanded by modern communication systems. With the ICASSP 2024 Audio Deep Packet Loss Concealment Grand Challenge, we build on the success of the previous Audio PLC Challenge held at INTERSPEECH 2022. For the 2024 challenge at ICASSP, we update the challenge by introducing an overall harder blind evaluation set and extending the task from wideband to fullband audio, in keeping with current trends in internet telephony. In addition to the Word Accuracy metric, we also use a questionnaire based on an extension of ITU-T P.804 to more closely evaluate the performance of systems specifically on the PLC task. We evaluate a total of 9 systems submitted by different academic and industry teams, 8 of which satisfy the strict real-time performance requirements of the challenge, using both P.804 and Word Accuracy evaluations. Two systems share first place, with one of the systems having the advantage in terms of naturalness, while the other wins in terms of intelligibility. 
These systems are the current state of the art for Deep PLC.","PeriodicalId":73300,"journal":{"name":"IEEE open journal of signal processing","volume":"6 ","pages":"231-237"},"PeriodicalIF":2.9,"publicationDate":"2025-01-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=10830479","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143471127","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"ICASSP 2024 Speech Signal Improvement Challenge","authors":"Nicolae-Cătălin Ristea;Babak Naderi;Ando Saabas;Ross Cutler;Sebastian Braun;Solomiya Branets","doi":"10.1109/OJSP.2025.3526550","DOIUrl":"https://doi.org/10.1109/OJSP.2025.3526550","url":null,"abstract":"The ICASSP 2024 Speech Signal Improvement Challenge aims to advance research in enhancing speech signal quality within communication systems. The speech signal quality can be assessed using the SIG metric from ITU-T P.835 and remains a top issue in audio communication and conferencing systems. For example, in the ICASSP 2023 Deep Noise Suppression Challenge, the improvement in the background and overall quality was impressive, while the speech signal enhancement was not statistically significant. To improve the speech signal, the following speech impairment areas must be addressed: coloration, discontinuity, loudness, reverberation, and noise. To this end, we organized the ICASSP 2024 Speech Signal Improvement Challenge, which marks the second signal-focused challenge, built upon the success of the previous ICASSP 2023 Speech Signal Improvement Challenge. A training and test set was provided for the challenge, and the winners were determined using an extended crowdsourced implementation of ITU-T P.804’s listening phase and the word accuracy (WAcc) rate. 
The results show that significant improvement was made across all measured dimensions of speech quality.","PeriodicalId":73300,"journal":{"name":"IEEE open journal of signal processing","volume":"6 ","pages":"238-246"},"PeriodicalIF":2.9,"publicationDate":"2025-01-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=10830509","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143471128","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Adversarial Robust Modulation Recognition Guided by Attention Mechanisms","authors":"Quanhai Zhan;Xiongwei Zhang;Meng Sun;Lei Song;Zhenji Zhou","doi":"10.1109/OJSP.2025.3526577","DOIUrl":"https://doi.org/10.1109/OJSP.2025.3526577","url":null,"abstract":"Deep neural networks have demonstrated considerable effectiveness in recognizing complex communications signals through their applications in the tasks of automatic modulation recognition. However, the resilience of these networks is undermined by the introduction of carefully designed adversarial examples that compromise the reliability of the decision processes. In order to address this issue, an Attention-Guided Automatic Modulation Recognition (AG-AMR) method is proposed in this paper. The method introduces an optimized attention mechanism within the Transformer framework, where signal features are extracted and filtered based on the weights of the attention module during the training process, which makes the model focus on key features for the task. Furthermore, by removing features of low importance where adversarial perturbations may appear, the proposed method mitigates the negative impacts of adversarial perturbations on modulation classification, thereby improving both accuracy and robustness. Experimental results on benchmark datasets show that AG-AMR obtains a high level of accuracy on modulation recognition and exhibits significant robustness. 
Furthermore, when working together with adversarial training, it is shown that AG-AMR effectively resists several existing adversarial attacks, which further validates its effectiveness in defending against adversarial sample attacks.","PeriodicalId":73300,"journal":{"name":"IEEE open journal of signal processing","volume":"6 ","pages":"17-29"},"PeriodicalIF":2.9,"publicationDate":"2025-01-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=10829960","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143107175","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Modular Hypernetworks for Scalable and Adaptive Deep MIMO Receivers","authors":"Tomer Raviv;Nir Shlezinger","doi":"10.1109/OJSP.2025.3526548","DOIUrl":"https://doi.org/10.1109/OJSP.2025.3526548","url":null,"abstract":"Deep neural networks (DNNs) were shown to facilitate the operation of uplink multiple-input multiple-output (MIMO) receivers, with emerging architectures augmenting modules of classic receiver processing. Current designs employ static DNNs, whose architecture is fixed and weights are pre-trained. This poses a notable challenge, as the resulting MIMO receiver is suitable for a given configuration, i.e., channel distribution and number of users, while in practice these parameters change frequently with network variations and users leaving and joining the network. In this work, we tackle this core challenge of DNN-aided MIMO receivers. We build upon the concept of <i>hypernetworks</i>, augmenting the receiver with a pre-trained deep model whose purpose is to update the weights of the DNN-aided receiver in response to instantaneous channel variations. We design our hypernetwork to augment <i>modular</i> deep receivers, leveraging their modularity to have the hypernetwork adapt not only the weights, but also the architecture. Our modular hypernetwork leads to a DNN-aided receiver whose architecture and resulting complexity adapt to the number of users, as well as to channel variations, without re-training. 
Our numerical studies demonstrate superior error-rate performance of modular hypernetworks in time-varying channels compared to static pre-trained receivers, while providing rapid adaptivity and scalability to network variations.","PeriodicalId":73300,"journal":{"name":"IEEE open journal of signal processing","volume":"6 ","pages":"256-265"},"PeriodicalIF":2.9,"publicationDate":"2025-01-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=10830517","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143465816","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Bayesian Sound Field Estimation Using Moving Microphones","authors":"Jesper Brunnström;Martin Bo Møller;Marc Moonen","doi":"10.1109/OJSP.2025.3526546","DOIUrl":"https://doi.org/10.1109/OJSP.2025.3526546","url":null,"abstract":"A sound field estimation method using moving microphones is proposed. The sound field is modelled with an infinite sequence of spherical harmonic coefficients for each frequency in the discrete Fourier transform domain, and the harmonic coefficients are estimated using Bayesian inference. The proposed method allows for the use of microphones with arbitrary directivities and moving along any trajectory. It can also be adapted for simultaneous measurement with multiple loudspeakers. The approach avoids truncation of the harmonic coefficients, as well as the specification of an expansion center. The proposed method is evaluated through simulations and compared against other state-of-the-art methods. The proposed method is shown to produce estimates with low error, and can be applied in a wider range of situations compared to other methods.","PeriodicalId":73300,"journal":{"name":"IEEE open journal of signal processing","volume":"6 ","pages":"312-322"},"PeriodicalIF":2.9,"publicationDate":"2025-01-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=10830478","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143489188","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"List of Reviewers","authors":"","doi":"10.1109/OJSP.2024.3498352","DOIUrl":"https://doi.org/10.1109/OJSP.2024.3498352","url":null,"abstract":"","PeriodicalId":73300,"journal":{"name":"IEEE open journal of signal processing","volume":"5 ","pages":"1153-1155"},"PeriodicalIF":2.9,"publicationDate":"2024-12-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=10799210","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142821148","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Charbonnier Quasi Hyperbolic Momentum Spline Based Incremental Strategy for Nonlinear Distributed Active Noise Control","authors":"Rajapantula Kranthi;Vasundhara;Asutosh Kar;Mads Græsbøll Christensen","doi":"10.1109/OJSP.2024.3501774","DOIUrl":"https://doi.org/10.1109/OJSP.2024.3501774","url":null,"abstract":"Noise mitigation is a challenging task for active noise control in the presence of nonlinearities. In such environments, functional link neural network (FLN) and adaptive exponential FLN techniques improve the performance of distributed active noise control systems. Nonlinear spline approaches are well known for their low computational complexity and ability to effectively alleviate noise in nonlinear systems. This paper proposes a new cost function for distributed active noise control (DANC) systems, based on the Charbonnier quasi hyperbolic momentum spline (CQHMS) with an incremental approach. This incremental CQHMS DANC method employs Charbonnier loss and a quasi hyperbolic momentum approach which minimizes gradient variance and local crossover points in order to enhance convergence and steady-state performance. 
The proposed technique demonstrates enhanced performance and achieves accelerated convergence when compared to existing techniques in a range of nonlinear DANC scenarios under varied nonlinear primary path and nonlinear secondary path conditions.","PeriodicalId":73300,"journal":{"name":"IEEE open journal of signal processing","volume":"6 ","pages":"1-15"},"PeriodicalIF":2.9,"publicationDate":"2024-11-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=10759299","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142937903","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}