IEEE Open Journal of Signal Processing: Latest Articles

L3DAS23: Learning 3D Audio Sources for Audio-Visual Extended Reality
Riccardo F. Gramaccioni; Christian Marinoni; Changan Chen; Aurelio Uncini; Danilo Comminiello
IEEE Open Journal of Signal Processing, vol. 5, pp. 632-640. Published 2024-03-12. DOI: 10.1109/OJSP.2024.3376297. IF 2.9. Citations: 0.
Abstract: The primary goal of the L3DAS (Learning 3D Audio Sources) project is to stimulate and support collaborative research on machine learning techniques applied to 3D audio signal processing. To this end, the L3DAS23 Challenge, presented at IEEE ICASSP 2023, focuses on two spatial audio tasks of paramount practical interest: 3D speech enhancement (3DSE) and 3D sound event localization and detection (3DSELD). Both tasks are evaluated within augmented reality applications. This paper describes the main results of the challenge. We provide the L3DAS23 dataset, a collection of first-order Ambisonics recordings in reverberant simulated environments. We retain some general characteristics of the previous L3DAS challenges: a pair of first-order Ambisonics microphones captures the audio signals, and the recordings are multiple-source and multiple-perspective. In this new edition, however, we introduce audio-visual scenarios by including images that depict the frontal view of the environments as seen from the microphones' perspective. This addition gives participants tools for exploring combinations of audio and images to solve the 3DSE and 3DSELD tasks. Alongside the new dataset, we provide updated baseline models designed to take advantage of audio-image pairs, and a supporting API for effortless replication of our results. Lastly, we present the results achieved by the participants of the L3DAS23 Challenge.

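The 3DSELD task above works on first-order Ambisonics (B-format) signals. A classical, pre-deep-learning way to localize a source from such signals is the acoustic intensity vector, sketched below; the synthetic B-format encoding convention used in the example is a simplifying assumption, and this is not the challenge baseline.

```python
import numpy as np

def intensity_doa(w, x, y, z):
    """Estimate a single direction of arrival from first-order Ambisonics
    (B-format) channels via the time-averaged acoustic intensity vector.
    Returns (azimuth, elevation) in radians."""
    # Active intensity is proportional to the product of the omni (W)
    # channel with each figure-of-eight channel (X, Y, Z).
    ix, iy, iz = np.mean(w * x), np.mean(w * y), np.mean(w * z)
    azimuth = np.arctan2(iy, ix)
    elevation = np.arctan2(iz, np.hypot(ix, iy))
    return azimuth, elevation

# Synthetic plane wave from azimuth 45 deg, elevation 0 deg, under an
# idealized B-format encoding (an assumption made for this sketch).
t = np.linspace(0.0, 1.0, 16000)
s = np.sin(2 * np.pi * 440 * t)
az_true = np.pi / 4
w, x, y, z = s, s * np.cos(az_true), s * np.sin(az_true), 0.0 * s
az, el = intensity_doa(w, x, y, z)
```

For a single dominant source the intensity vector points at the source; with multiple overlapping sources, as in the challenge scenes, learned models take over.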
Auditory EEG Decoding Challenge for ICASSP 2023
Mohammad Jalilpour Monesi; Lies Bollens; Bernd Accou; Jonas Vanthornhout; Hugo Van Hamme; Tom Francart
IEEE Open Journal of Signal Processing, vol. 5, pp. 652-661. Published 2024-03-12. DOI: 10.1109/OJSP.2024.3376296. IF 2.9. Citations: 0.
Abstract: This paper describes the auditory EEG challenge, organized as one of the Signal Processing Grand Challenges at ICASSP 2023. The challenge provides EEG recordings of 85 subjects who listened to continuous speech, as audiobooks or podcasts, while their brain activity was recorded. Recordings of 71 subjects were provided as a training set so that participants could train their models on a relatively large dataset; the remaining 14 subjects were held out for evaluation. The challenge consists of two tasks relating electroencephalogram (EEG) signals to the presented speech stimulus. The first task, match-mismatch, is to determine which of two speech segments induced a given EEG segment. The second, a regression task, is to reconstruct the speech envelope from the EEG. For the match-mismatch task, the performance of different teams was close to the baseline model, and the models generalized well to unseen subjects. In contrast, for the regression task, the top teams significantly improved over the baseline models on the held-out-stories test set while failing to generalize to unseen subjects.

Person Identification and Relapse Detection From Continuous Recordings of Biosignals Challenge: Overview and Results
Athanasia Zlatintsi; Panagiotis P. Filntisis; Niki Efthymiou; Christos Garoufis; George Retsinas; Thomas Sounapoglou; Ilias Maglogiannis; Panayiotis Tsanakas; Nikolaos Smyrnis; Petros Maragos
IEEE Open Journal of Signal Processing, vol. 5, pp. 641-651. Published 2024-03-12. DOI: 10.1109/OJSP.2024.3376300. IF 2.9. Citations: 0.
Abstract: This paper presents an overview of the e-Prevention: Person Identification and Relapse Detection Challenge, an open call for researchers at ICASSP 2023. The challenge concerned the analysis and processing of long-term continuous recordings of biosignals from wearable sensors (accelerometers, gyroscopes, and heart rate monitors embedded in smartwatches), together with sleep information and daily step counts, in order to extract high-level representations of the wearer's activity and behavior, termed digital phenotypes. Specifically, to assess how well these digital phenotypes quantify behavioral patterns, two tasks were evaluated in two distinct tracks: 1) identification of the wearer of the smartwatch, and 2) detection of psychotic relapses in patients on the psychotic spectrum. The long-term data used in this challenge were acquired during the e-Prevention project (Zlatintsi et al., 2022), an innovative integrated system for medical support that facilitates effective monitoring and relapse prevention in patients with mental disorders. Two baseline systems, one per task, were described, and validation scores for both tasks were provided to the participants. Herein, we present an overview of the approaches and methods, along with the performance analysis and results, of the top-5 ranked participating teams: in track 1 they achieved accuracies between 91% and 95%, while in track 2 mean PR- and ROC-AUC scores between 0.6051 and 0.6489 were obtained. Finally, we make the datasets publicly available at https://robotics.ntua.gr/eprevention-sp-challenge/.

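Digital phenotypes of the kind described above are typically built by summarizing raw sensor streams into per-window statistics. The sketch below shows the general pattern for an accelerometer stream; the particular features and window length are illustrative choices, not the challenge baseline's.

```python
import numpy as np

def windowed_features(acc, win=50):
    """Per-window statistics of acceleration magnitude from a
    (n_samples, 3) accelerometer stream: mean, standard deviation, and
    mean absolute successive difference. A simplified stand-in for
    digital-phenotype feature extraction."""
    mag = np.linalg.norm(acc, axis=1)          # combined 3-axis magnitude
    n_win = len(mag) // win
    mag = mag[:n_win * win].reshape(n_win, win)
    return np.column_stack([
        mag.mean(axis=1),
        mag.std(axis=1),
        np.abs(np.diff(mag, axis=1)).mean(axis=1),
    ])

rng = np.random.default_rng(1)
acc = rng.standard_normal((500, 3))            # ~10 s at 50 Hz (synthetic)
feats = windowed_features(acc)                 # shape: (n_windows, n_features)
```

Downstream, such window-level features feed the identification and relapse-detection classifiers of the two tracks.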
ICASSP 2023 Speech Signal Improvement Challenge
Ross Cutler; Ando Saabas; Babak Naderi; Nicolae-Cătălin Ristea; Sebastian Braun; Solomiya Branets
IEEE Open Journal of Signal Processing, vol. 5, pp. 662-674. Published 2024-03-12. DOI: 10.1109/OJSP.2024.3376293. IF 2.9. Citations: 0.
Abstract: The ICASSP 2023 Speech Signal Improvement Challenge is intended to stimulate research on improving speech signal quality in communication systems. Speech signal quality can be measured with the SIG score in ITU-T P.835 and remains a top issue in audio communication and conferencing systems. For example, in the ICASSP 2022 Deep Noise Suppression Challenge, the improvement in background and overall quality was impressive, but the improvement in the speech signal was not statistically significant. Improving the speech signal requires addressing the following impairment areas: coloration, discontinuity, loudness, reverberation, and noise. A training and test set was provided for the challenge, and the winners were determined using an extended crowdsourced implementation of ITU-T P.804's listening phase. The results show that significant improvement was made across all measured dimensions of speech quality.

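Crowdsourced P.835-style ratings such as those above are usually compared via mean opinion scores with confidence intervals. A minimal sketch of that aggregation step, using a normal approximation rather than the challenge's full statistical analysis:

```python
import numpy as np

def mos_with_ci(ratings, z=1.96):
    """Mean opinion score with a normal-approximation 95% confidence
    interval; two conditions are called significantly different when
    their intervals do not overlap (a simplified convention)."""
    r = np.asarray(ratings, dtype=float)
    mos = r.mean()
    half = z * r.std(ddof=1) / np.sqrt(len(r))
    return mos, (mos - half, mos + half)

# Ten hypothetical 1-5 ratings of one processed clip.
mos, (lo, hi) = mos_with_ci([4, 5, 3, 4, 4, 5, 4, 3, 4, 4])
```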
Spatial Sigma-Delta Modulation for Coarsely Quantized Massive MIMO Downlink: Flexible Designs by Convex Optimization
Wai-Yiu Keung; Wing-Kin Ma
IEEE Open Journal of Signal Processing, vol. 5, pp. 520-538. Published 2024-03-11. DOI: 10.1109/OJSP.2024.3375653. Citations: 0.
Abstract: This article considers multiuser massive MIMO downlink precoding with low-resolution digital-to-analog converters (DACs) at the transmitter, motivated by the consideration that high-resolution DACs are expensive for practical massive MIMO implementations. The challenge with low-resolution DACs is to overcome the detrimental effects of quantization error. Recently, spatial Sigma-Delta ($\Sigma\Delta$) modulation has arisen as a viable way to put quantization errors under control, taking insight from temporal $\Sigma\Delta$ modulation in classical DAC studies. Assuming a 1D uniform linear transmit antenna array, the principle is to shape the quantization errors in space such that they are pushed away from the user-serving angle sector. In previous studies, spatial $\Sigma\Delta$ modulation was performed by direct application of the basic first- and second-order modulators from the $\Sigma\Delta$ literature. In this paper, we develop a general $\Sigma\Delta$ modulator design framework for any given order, number of quantization levels, and angle sector. We formulate the design as maximization of the signal-to-quantization-and-noise ratios (SQNRs) experienced by the users. The formulated problem is convex and can be efficiently solved by available solvers. Our framework also offers the alternative option of focused quantization error suppression in accordance with channel state information, and can be extended to 2D planar transmit antenna arrays. We perform a numerical study under different operating conditions, and the results suggest that, given a moderate number of quantization levels, say 5 to 7, our optimization-based $\Sigma\Delta$ modulation schemes can achieve bit error rate performance close to that of the unquantized counterpart.

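The basic first-order modulator that the paper generalizes can be sketched in a few lines: the quantization error at one antenna is fed back and subtracted at the next, so the shaped error becomes high-pass in the spatial-frequency domain (pushed away from broadside). The sketch below is real-valued and omits angle steering and the paper's optimized higher-order designs.

```python
import numpy as np

def first_order_sigma_delta(x, levels):
    """First-order Sigma-Delta quantization applied across the antennas
    of a uniform linear array. `levels` is the array of allowed outputs."""
    out = np.empty_like(x)
    err = 0.0
    for n in range(len(x)):
        v = x[n] - err                                 # error feedback
        q = levels[np.argmin(np.abs(levels - v))]      # nearest quantization level
        err = q - v                                    # new quantization error
        out[n] = q
    return out

x = np.sin(0.3 * np.arange(64))                        # per-antenna signal
q = first_order_sigma_delta(x, np.linspace(-1.0, 1.0, 5))
```

Because consecutive errors telescope (the output error at antenna n is e[n] - e[n-1]), the shaped error averages out over the array even though each sample is coarsely quantized.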
Representation Synthesis by Probabilistic Many-Valued Logic Operation in Self-Supervised Learning
Hiroki Nakamura; Masashi Okada; Tadahiro Taniguchi
IEEE Open Journal of Signal Processing, vol. 5, pp. 831-840. Published 2024-03-10. DOI: 10.1109/OJSP.2024.3399663. IF 2.9. Citations: 0.
Abstract: In this paper, we propose a new self-supervised learning (SSL) method for representations that support logic operations. Representation learning has been applied to tasks such as image generation and retrieval, for which the logical controllability of representations is important. Although some methods enable intuitive control of representations using natural language as input, controlling representations via logic operations between representations has not been demonstrated. SSL methods using representation synthesis (e.g., elementwise mean and maximum operations) have been proposed, but the operations performed in these methods do not incorporate logic. In this work, we propose a logic-operable self-supervised representation learning method by replacing the existing representation synthesis with the OR operation of a probabilistic extension of many-valued logic. The representations comprise a set of feature-possession degrees, truth values indicating the presence or absence of each feature in the image, which realize logic operations such as OR and AND. Our method can generate a representation that has the features of both input representations, or only those features common to both. Furthermore, the ambiguous presence of a feature is expressed by giving the feature-possession degree as a probability distribution over the truth values of the many-valued logic. We show that our method performs competitively in single- and multi-label classification tasks compared with prior SSL methods using synthetic representations. Moreover, image retrieval experiments on MNIST and PascalVOC show that the representations learned by our method can be manipulated by OR and AND operations.

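One standard probabilistic extension of logic operations to degrees in [0, 1] is the product t-norm and its dual co-norm, sketched below on vectors of feature-possession degrees; the paper's exact operators may differ from this choice.

```python
import numpy as np

def p_or(a, b):
    """Probabilistic OR (algebraic sum) of feature-possession degrees."""
    return a + b - a * b

def p_and(a, b):
    """Probabilistic AND (product t-norm) of feature-possession degrees."""
    return a * b

# Two representations as vectors of feature-possession degrees in [0, 1].
a = np.array([0.9, 0.1, 0.5])
b = np.array([0.2, 0.8, 0.5])
union = p_or(a, b)      # features present in either image
common = p_and(a, b)    # features common to both images
```

OR yields a representation carrying the features of either input, AND only the features shared by both, which mirrors the retrieval operations demonstrated in the paper.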
PtychoDV: Vision Transformer-Based Deep Unrolling Network for Ptychographic Image Reconstruction
Weijie Gan; Qiuchen Zhai; Michael T. McCann; Cristina Garcia Cardona; Ulugbek S. Kamilov; Brendt Wohlberg
IEEE Open Journal of Signal Processing, vol. 5, pp. 539-547. Published 2024-03-08. DOI: 10.1109/OJSP.2024.3375276. Citations: 0.
Abstract: Ptychography is an imaging technique that captures multiple overlapping snapshots of a sample, illuminated coherently by a moving localized probe. Image recovery from ptychographic data is generally achieved via an iterative algorithm that solves a nonlinear phase retrieval problem derived from measured diffraction patterns; however, these iterative approaches have high computational cost. In this paper, we introduce PtychoDV, a novel deep model-based network designed for efficient, high-quality ptychographic image reconstruction. PtychoDV comprises a vision transformer that generates an initial image from the set of raw measurements, taking their mutual correlations into account, followed by a deep unrolling network that refines the initial image using learnable convolutional priors and the ptychography measurement model. Experimental results on simulated data demonstrate that PtychoDV outperforms existing deep learning methods for this problem and significantly reduces computational cost compared to iterative methodologies while maintaining competitive performance.

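The deep-unrolling pattern described above alternates a data-consistency step driven by the measurement model with a prior (refinement) step. The toy sketch below uses a linear forward model and a fixed soft-threshold prior; in PtychoDV the initialization is a vision transformer, the priors are learned convolutions, and the forward model is the nonlinear ptychography operator.

```python
import numpy as np

def unrolled_recon(y, A, n_iters=8, step=0.5, thresh=0.05):
    """Skeleton of an unrolled reconstruction: alternate a gradient step
    on ||Ax - y||^2 (measurement model) with a prior step (here a fixed
    soft threshold standing in for a learned prior)."""
    x = A.T @ y                                         # crude initialization
    for _ in range(n_iters):
        x = x - step * (A.T @ (A @ x - y))              # data-consistency step
        x = np.sign(x) * np.maximum(np.abs(x) - thresh, 0.0)  # prior step
    return x

rng = np.random.default_rng(2)
A = rng.standard_normal((40, 20)) / np.sqrt(40)         # toy linear measurement
x_true = np.zeros(20)
x_true[[3, 7]] = [1.0, -0.8]                            # sparse ground truth
y = A @ x_true
x_hat = unrolled_recon(y, A)
```

Unrolling fixes the number of iterations, so inference cost is bounded up front, which is the source of the speedup over fully iterative phase retrieval.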
Towards a Geometric Understanding of Spatiotemporal Graph Convolution Networks
Pratyusha Das; Sarath Shekkizhar; Antonio Ortega
IEEE Open Journal of Signal Processing, vol. 5, pp. 1023-1030. Published 2024-03-03. DOI: 10.1109/OJSP.2024.3396635. IF 2.9. Citations: 0.
Abstract: Spatiotemporal graph convolutional networks (STGCNs) have emerged as a desirable model for skeleton-based human action recognition. Despite achieving state-of-the-art performance, there is a limited understanding of the representations learned by these models, which hinders their application in critical and real-world settings. While layerwise analysis of CNN models has been studied in the literature, to the best of our knowledge there exists no study on the layerwise explainability of the embeddings learned on spatiotemporal data using STGCNs. In this paper, we first propose to use a local Dataset Graph (DS-Graph), obtained from the feature representation of input data at each layer, to develop an understanding of the layer-wise embedding geometry of the STGCN. To do so, we develop a window-based dynamic time warping (DTW) method to compute the distance between data sequences with varying temporal lengths. To validate our findings, we have developed a layer-specific Spatiotemporal Graph Gradient-weighted Class Activation Mapping (L-STG-GradCAM) technique tailored for spatiotemporal data, which enables us to visually analyze and interpret each layer within the STGCN network. We characterize the functions learned by each layer of the STGCN using the label smoothness of the representation and visualize them using our L-STG-GradCAM approach. Our proposed method is generic and can yield valuable insights for STGCN architectures in different applications; this paper focuses on the human activity recognition task as a representative application. Our experiments show that STGCN models learn representations that capture general human motion in their initial layers while discriminating between different actions only in later layers. This justifies experimental observations showing that fine-tuning deeper layers works well for transfer between related tasks. We provide experimental evidence for different human activity datasets and advanced spatiotemporal graph networks to validate that the proposed method is general enough to analyze any STGCN model and can be useful for drawing insight into networks in various scenarios. We also show that noise at the input has a limited effect on label smoothness, which can help justify the robustness of STGCNs to noise.

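The window-based DTW mentioned above compares feature sequences of different temporal lengths while restricting how far the alignment can stray from the diagonal. A minimal scalar-sequence version with a Sakoe-Chiba band is sketched below; the paper applies the idea to per-layer feature sequences, and the cost function and window here are illustrative.

```python
import numpy as np

def dtw_distance(a, b, window=5):
    """Dynamic time warping distance between two 1-D sequences of
    possibly different lengths, restricted to a Sakoe-Chiba band."""
    n, m = len(a), len(b)
    w = max(window, abs(n - m))                 # band must cover the length gap
    D = np.full((n + 1, m + 1), np.inf)
    D[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(max(1, i - w), min(m, i + w) + 1):
            cost = abs(a[i - 1] - b[j - 1])
            D[i, j] = cost + min(D[i - 1, j], D[i, j - 1], D[i - 1, j - 1])
    return D[n, m]

# Two samplings of the same curve with different temporal lengths align
# cheaply under DTW even though pointwise comparison is undefined.
d = dtw_distance(np.sin(np.linspace(0, 6, 50)), np.sin(np.linspace(0, 6, 60)))
```

Pairwise DTW distances like `d` are what populate the local Dataset Graph used to study each layer's embedding geometry.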
Attention-Based End-to-End Differentiable Particle Filter for Audio Speaker Tracking
Jinzheng Zhao; Yong Xu; Xinyuan Qian; Haohe Liu; Mark D. Plumbley; Wenwu Wang
IEEE Open Journal of Signal Processing, vol. 5, pp. 449-458. Published 2024-02-08. DOI: 10.1109/OJSP.2024.3363649. Citations: 0.
Abstract: Particle filters (PFs) have been widely used in speaker tracking due to their capability to model non-linear processes and non-Gaussian environments. However, particle filters are limited by several issues. For example, pre-defined handcrafted measurements are often used, which can limit model performance. In addition, the transition and update models are often preset, which makes PFs less flexible to adapt to different scenarios. To address these issues, we propose an end-to-end differentiable particle filter framework that employs multi-head attention to model long-range dependencies. The proposed model uses self-attention as the learned transition model and cross-attention as the learned update model. To our knowledge, this is the first proposal combining particle filters and transformers for speaker tracking, where the measurement extraction, transition, and update steps are integrated into an end-to-end architecture. Experimental results show that the proposed model achieves superior performance over recurrent baseline models.

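For context, the classical bootstrap particle filter that this work makes differentiable has a fixed transition model, a fixed likelihood update, and a resampling step. A minimal 1-D version is sketched below (a toy random-walk model, not the paper's audio tracking setup); the paper replaces the first two steps with learned self- and cross-attention.

```python
import numpy as np

def bootstrap_pf(obs, n_particles=500, q=0.1, r=0.5, seed=0):
    """Classical bootstrap particle filter for a 1-D random-walk state
    observed in Gaussian noise. Returns the posterior-mean estimate at
    each time step."""
    rng = np.random.default_rng(seed)
    parts = rng.standard_normal(n_particles)
    est = []
    for z in obs:
        parts = parts + q * rng.standard_normal(n_particles)      # transition
        w = np.exp(-0.5 * ((z - parts) / r) ** 2)                 # update: likelihood
        w /= w.sum()
        est.append(w @ parts)                                     # posterior mean
        parts = parts[rng.choice(n_particles, n_particles, p=w)]  # resample
    return np.array(est)

rng = np.random.default_rng(3)
true_pos = np.cumsum(0.1 * rng.standard_normal(100))   # synthetic trajectory
obs = true_pos + 0.5 * rng.standard_normal(100)        # noisy measurements
est = bootstrap_pf(obs)
```

The hard-coded Gaussian transition and likelihood above are exactly the "preset" components the abstract criticizes; learning them end-to-end is the paper's contribution.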
Efficient Channel-Temporal Attention for Boosting RF Fingerprinting
Hanqing Gu; Lisheng Su; Yuxia Wang; Weifeng Zhang; Chuan Ran
IEEE Open Journal of Signal Processing, vol. 5, pp. 478-492. Published 2024-02-06. DOI: 10.1109/OJSP.2024.3362695. Citations: 0.
Abstract: In recent years, deep convolutional neural networks (DCNNs) have been widely used for radio frequency (RF) fingerprinting. DCNNs can learn suitable convolution kernels from data and extract RF fingerprints directly from raw in-phase/quadrature (IQ) data; these fingerprints arise from variations or minor flaws in transmitters' circuits and enable the identification of a specific transmitter. One of the main challenges in employing this technology is optimizing the model design so that it automatically learns discriminative RF fingerprints and is robust to changes in environmental factors. To this end, this paper proposes ECTAttention, an Efficient Channel-Temporal Attention block that can be used to enhance the feature learning capability of DCNNs. ECTAttention has two parallel branches: one automatically mines the correlation between channels through channel attention to discover and enhance important convolution kernels, while the other recalibrates the feature map through temporal attention. ECTAttention is flexible and efficient, and can be combined with existing DCNNs to strengthen their feature learning at the cost of only a small amount of additional computation, achieving high RF fingerprinting precision. Our experimental results show that a ResNet enhanced by ECTAttention can identify 10 USRP X310 SDRs with an accuracy of 97.5%, and achieves a recognition accuracy of 91.9% on 56 real ADS-B signal sources under unconstrained acquisition conditions.

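The channel branch described above follows the general squeeze-and-excitation pattern: pool each channel to a scalar, pass through a small bottleneck, and gate the channels with a sigmoid. The NumPy sketch below shows that pattern in generic form; ECTAttention's exact layers, and its parallel temporal branch, are defined in the paper.

```python
import numpy as np

def channel_attention(x, w_down, w_up):
    """Squeeze-and-excitation-style channel attention on a feature map of
    shape (channels, time): global average pooling per channel, a small
    bottleneck MLP with ReLU, then sigmoid gating of each channel."""
    s = x.mean(axis=1)                          # squeeze: one value per channel
    h = np.maximum(w_down @ s, 0.0)             # bottleneck + ReLU
    g = 1.0 / (1.0 + np.exp(-(w_up @ h)))       # per-channel sigmoid gate
    return x * g[:, None]                       # recalibrate the feature map

rng = np.random.default_rng(5)
x = rng.standard_normal((8, 128))               # 8 channels, 128 time steps
w_down = 0.1 * rng.standard_normal((2, 8))      # reduction ratio 4 (illustrative)
w_up = 0.1 * rng.standard_normal((8, 2))
y = channel_attention(x, w_down, w_up)
```

Because the gate values lie in (0, 1), the block can only rescale channels, never amplify them past their input magnitude, which keeps the added computation and risk of instability small.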