IEEE Signal Processing Letters: Latest Articles

IrisFormer: A Dedicated Transformer Framework for Iris Recognition
IF 3.2 | CAS Zone 2 (Engineering & Technology)
IEEE Signal Processing Letters | Pub Date: 2024-12-26 | DOI: 10.1109/LSP.2024.3522856
Xianyun Sun; Caiyong Wang; Yunlong Wang; Jianze Wei; Zhenan Sun
Abstract: While Vision Transformer (ViT)-based methods have significantly improved the performance of various vision tasks in natural scenes, progress in iris recognition remains limited. Moreover, the human iris exhibits unique characteristics distinct from natural scene content. To address this, this paper investigates a dedicated Transformer framework, termed IrisFormer, for iris recognition, combining the contextual modeling ability of ViT with iris-specific optimizations to learn robust, fine-grained, and discriminative features. Specifically, to achieve rotation invariance in iris recognition, relative position encoding replaces the standard absolute position encoding for each iris image token, and a horizontal pixel-shifting strategy is applied during training for data augmentation. To enhance the model's robustness against local distortions such as occlusions and reflections, some tokens are randomly masked during training, forcing the model to learn representative identity features from only part of the image. Finally, because fine-grained features are more discriminative in iris recognition, the entire token sequence is retained for patch-wise feature matching instead of using the standard single classification token. Experiments on three popular datasets demonstrate that the proposed framework achieves competitive performance under both intra- and inter-dataset testing protocols.
Volume 32, pp. 431-435.
Citations: 0
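The horizontal pixel-shifting augmentation exploits the fact that, in a polar-normalized iris strip, an in-plane rotation of the eye becomes a circular horizontal shift. Below is a minimal sketch of such an augmentation; the `max_shift` hyperparameter is an assumption, not a value from the paper.

```python
import numpy as np

def horizontal_pixel_shift(iris_strip, max_shift=16, rng=None):
    """Randomly roll a polar-normalized iris strip along the angular axis.

    In the unrolled representation, rotating the eye corresponds to a
    circular horizontal shift, so training on shifted copies encourages
    rotation-invariant features. `max_shift` is hypothetical.
    """
    rng = rng or np.random.default_rng()
    shift = int(rng.integers(-max_shift, max_shift + 1))
    return np.roll(iris_strip, shift, axis=1)  # axis 1 = angular dimension
```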
Synchronous and Asynchronous HARQ-CC Assisted SCMA Schemes
IF 3.2 | CAS Zone 2 (Engineering & Technology)
IEEE Signal Processing Letters | Pub Date: 2024-12-26 | DOI: 10.1109/LSP.2024.3523227
Man Wang; Zheng Shi; Yunfei Li; Xianda Wu; Weiqiang Tan
Abstract: This letter proposes a novel hybrid automatic repeat request with chase combining assisted sparse code multiple access (HARQ-CC-SCMA) scheme. Depending on whether the same superimposed packet is retransmitted, both synchronous and asynchronous retransmission modes are considered. A factor graph aggregation (FGA) method is used for multi-user detection: a large-scale factor graph is constructed by combining all received superimposed signals, and the message passing algorithm (MPA) is applied to compute log-likelihood ratios (LLRs). Monte Carlo simulations show that FGA surpasses bit-level combining (BLC) and HARQ with incremental redundancy (HARQ-IR) in synchronous mode, and outperforms BLC in the high signal-to-noise ratio (SNR) region in asynchronous mode. At low SNR in asynchronous mode, however, FGA is worse than BLC, because failed messages remaining after the maximum allowable number of HARQ rounds induce significant error propagation.
Volume 32, pp. 506-510.
Citations: 0
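For background, chase combining retransmits the identical coded packet, so per-bit LLRs from independent rounds combine by simple addition. The sketch below shows this generic HARQ-CC rule; it is not the paper's FGA detector, which instead runs MPA over one large combined factor graph.

```python
import numpy as np

def chase_combine_llr(llr_rounds):
    """Combine per-round bit LLRs under chase combining (HARQ-CC).

    When the identical packet is retransmitted over independent noise
    realizations, the per-bit LLRs are conditionally independent given
    the bits, so optimal combining reduces to a sum across rounds.
    """
    return np.sum(np.stack(llr_rounds, axis=0), axis=0)

# Usage: combined = chase_combine_llr([llr_round1, llr_round2, llr_round3])
```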
Learning Dynamic Distractor-Repressed Correlation Filter for Real-Time UAV Tracking
IF 3.2 | CAS Zone 2 (Engineering & Technology)
IEEE Signal Processing Letters | Pub Date: 2024-12-26 | DOI: 10.1109/LSP.2024.3522850
Zhi Chen; Lijun Liu; Zhen Yu
Abstract: With high computational efficiency and desirable tracking accuracy, discriminative correlation filters (DCFs) have been widely used in UAV tracking, leading to substantial progress. However, in intricate scenarios (e.g., similar objects or backgrounds, background clutter), DCF-based trackers are prone to producing low-reliability response maps influenced by surrounding response distractors, which reduces tracking robustness. Furthermore, the limited computational resources and endurance of UAV platforms require DCF-based trackers to deliver real-time, reliable tracking performance. To address these issues, a dynamic distractor-repressed correlation filter (DDRCF) is proposed. First, a dynamic distractor-repressed regularization term is introduced into the DCF framework. A new objective function is then formulated to tune the penalty intensity of the distractor-repression module, and a novel response-map variation evaluation mechanism dynamically adjusts the regularization coefficient to adapt to omnipresent appearance variations. Extensive experiments on four prevailing UAV benchmarks, i.e., UAV123@10fps, UAVTrack112, DTB70, and UAVDT, validate that the proposed DDRCF tracker is superior to other state-of-the-art trackers. Moreover, the method achieves a tracking speed of 59 FPS on a CPU, meeting the requirements of real-time aerial tracking.
Volume 32, pp. 616-620.
Citations: 0
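One plausible form of a response-map variation score, sketched below, aligns consecutive response maps at their peaks and measures their normalized difference; a large score suggests distractors or abrupt appearance change, so the repression coefficient could be raised. This is an illustration of the idea, and the paper's exact mechanism may differ.

```python
import numpy as np

def response_variation(resp_prev, resp_curr, eps=1e-8):
    """Hypothetical response-map variation score for a DCF tracker.

    Shifts the previous response map so its peak coincides with the
    current peak, then returns the normalized Euclidean difference.
    """
    dy, dx = np.subtract(np.unravel_index(resp_curr.argmax(), resp_curr.shape),
                         np.unravel_index(resp_prev.argmax(), resp_prev.shape))
    aligned = np.roll(resp_prev, (dy, dx), axis=(0, 1))
    return np.linalg.norm(resp_curr - aligned) / (np.linalg.norm(aligned) + eps)
```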
Infrared Small Target Detection via Local-Global Feature Fusion
IF 3.2 | CAS Zone 2 (Engineering & Technology)
IEEE Signal Processing Letters | Pub Date: 2024-12-26 | DOI: 10.1109/LSP.2024.3523226
Lang Wu; Yong Ma; Fan Fan; Jun Huang
Abstract: Due to high-luminance (HL) background clutter in infrared (IR) images, existing IR small target detection methods struggle to achieve a good balance between efficiency and performance. HL clutter is difficult to suppress and leads to high false alarm rates; to address this, this letter proposes an IR small target detection method based on local-global feature fusion (LGFF). We develop a fast and efficient local feature extraction operator and use global rarity to characterize the global feature of small targets, suppressing a significant amount of HL clutter. By integrating local and global features, we achieve further target enhancement and robust clutter suppression. Experimental results demonstrate that the proposed method outperforms existing methods in target enhancement, clutter removal, and real-time performance.
Volume 32, pp. 466-470.
Citations: 0
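As a toy illustration of the global-rarity idea, the sketch below scores each pixel by the negative log frequency of its gray level: small targets occupy few pixels, so their intensities tend to be globally rare, while extended HL clutter is common and scores low. This is a simplified stand-in, not the paper's operator.

```python
import numpy as np

def global_rarity(img, bins=64):
    """Toy global-rarity map: globally rare gray levels score high."""
    hist, edges = np.histogram(img, bins=bins,
                               range=(float(img.min()), float(img.max()) + 1e-6))
    prob = hist / hist.sum()                       # global frequency per bin
    idx = np.clip(np.digitize(img, edges[1:-1]), 0, bins - 1)
    return -np.log(prob[idx] + 1e-12)              # rarity = -log frequency
```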
GALD-SE: Guided Anisotropic Lightweight Diffusion for Efficient Speech Enhancement
IF 3.2 | CAS Zone 2 (Engineering & Technology)
IEEE Signal Processing Letters | Pub Date: 2024-12-26 | DOI: 10.1109/LSP.2024.3522852
Chengzhong Wang; Jianjun Gu; Dingding Yao; Junfeng Li; Yonghong Yan
Abstract: Speech enhancement aims to improve the intelligibility and quality of speech under diverse noise conditions. Diffusion models have recently attracted considerable attention in this area, achieving competitive results. Current diffusion-based methods blur the signal distribution with isotropic Gaussian noise and recover the clean speech distribution from this prior, but they often carry a substantial computational burden. We argue that this inefficiency partly stems from overlooking that speech enhancement is not purely a generative task: it primarily involves noise reduction and completion of missing information, and the clean clues already present in the noisy mixture do not need to be regenerated. In this paper, we propose a method that injects noise with anisotropic guidance during the diffusion process, allowing the neural network to preserve clean clues within noisy recordings. This approach substantially reduces computational complexity while remaining robust against various forms of noise and speech distortion. Experiments demonstrate that the proposed method achieves state-of-the-art results with only about 4.5 million parameters, far fewer than other diffusion methods require, effectively narrowing the model-size gap between diffusion-based and predictive speech enhancement approaches. The method also performs well in very noisy scenarios, demonstrating its potential for highly challenging environments.
Volume 32, pp. 426-430.
Citations: 0
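The core idea, injecting noise anisotropically so clean-dominant time-frequency bins are left largely intact, can be sketched as below; the guidance estimator, tensor shapes, and noise schedule are assumptions, not the paper's implementation.

```python
import torch

def anisotropic_noise_step(noisy_spec, guidance, sigma):
    """One forward-diffusion step with guidance-weighted anisotropic noise.

    `guidance` in [0, 1] estimates how noise-dominated each TF bin is
    (e.g., from an a priori SNR estimate); bins judged clean receive
    little injected noise, so their clues survive the diffusion process.
    """
    noise = torch.randn_like(noisy_spec)
    return noisy_spec + sigma * guidance * noise

# Usage sketch: spec and guidance are (batch, freq, time) real tensors.
spec = torch.randn(2, 257, 100)
guidance = torch.rand(2, 257, 100)     # hypothetical noise-dominance mask
diffused = anisotropic_noise_step(spec, guidance, sigma=0.5)
```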
Enhancing No-Reference Audio-Visual Quality Assessment via Joint Cross-Attention Fusion
IF 3.2 | CAS Zone 2 (Engineering & Technology)
IEEE Signal Processing Letters | Pub Date: 2024-12-25 | DOI: 10.1109/LSP.2024.3522855
Zhaolin Wan; Xiguang Hao; Xiaopeng Fan; Wangmeng Zuo; Debin Zhao
Abstract: As multimedia consumption continues to rise, audio and video have become central to everyday entertainment and social interaction. This growing reliance amplifies the demand for effective and objective audio-visual quality assessment (AVQA) that captures the interaction between audio and visual elements, ultimately enhancing user satisfaction. However, existing state-of-the-art AVQA methods often rely on simplistic machine learning models or fully connected networks for audio-visual signal fusion, which limits their ability to exploit the complementary nature of the two modalities. To close this gap, we propose a novel no-reference AVQA method that uses joint cross-attention fusion of audio-visual perception. Our approach begins with a dual-stream feature extraction process that simultaneously captures long-range spatiotemporal visual features and audio features. The fusion model then dynamically adjusts the contributions of features from both modalities, integrating them into a more comprehensive representation for quality score prediction. Experimental results on the LIVE-SJTU and UnB-AVC datasets demonstrate that our model outperforms state-of-the-art methods in audio-visual quality assessment.
Volume 32, pp. 556-560.
Citations: 0
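A minimal PyTorch sketch of joint cross-attention fusion in this spirit is shown below: each modality attends to the other, and the two attended streams are pooled and concatenated for quality regression. Dimensions and head counts are illustrative, not the paper's configuration.

```python
import torch
import torch.nn as nn

class JointCrossAttentionFusion(nn.Module):
    """Sketch of audio-visual cross-attention fusion for quality scoring."""

    def __init__(self, dim=256, heads=4):
        super().__init__()
        self.a2v = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.v2a = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.head = nn.Linear(2 * dim, 1)

    def forward(self, video_tokens, audio_tokens):
        # Video queries audio, and vice versa, so each stream is
        # re-weighted by evidence from the other modality.
        v, _ = self.a2v(video_tokens, audio_tokens, audio_tokens)
        a, _ = self.v2a(audio_tokens, video_tokens, video_tokens)
        fused = torch.cat([v.mean(dim=1), a.mean(dim=1)], dim=-1)
        return self.head(fused).squeeze(-1)   # predicted quality score

# Usage: (batch, tokens, dim) features from the two extraction streams.
model = JointCrossAttentionFusion()
score = model(torch.randn(2, 16, 256), torch.randn(2, 20, 256))
```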
Outlier Indicator Based Projection Fuzzy K-Means Clustering for Hyperspectral Image
IF 3.2 | CAS Zone 2 (Engineering & Technology)
IEEE Signal Processing Letters | Pub Date: 2024-12-25 | DOI: 10.1109/LSP.2024.3521714
Xinze Liu; Xiaojun Yang; Jiale Zhang; Jing Wang; Feiping Nie
Abstract: Hyperspectral image (HSI) clustering has become widely used in remote sensing. Traditional fuzzy K-means clustering methods often struggle with HSI data because of significant noise, resulting in segmentation inaccuracies. To address this limitation, this letter introduces an outlier indicator-based projection fuzzy K-means clustering (OIPFK) algorithm that enhances the efficacy and robustness of earlier fuzzy K-means methods through a two-pronged strategy. First, an outlier indicator vector is constructed to identify noise and outliers by computing the distances between data points in a reduced-dimensional space. The OIPFK algorithm then incorporates the fuzzy membership relationships between samples and cluster centers in this lower-dimensional space, together with the outlier indicator vector, to significantly mitigate the influence of noise and extraneous features. An efficient iterative optimization algorithm is employed to solve the resulting optimization problem. Experimental results on three real-world hyperspectral image datasets demonstrate the effectiveness and superiority of the proposed method.
Volume 32, pp. 496-500.
Citations: 0
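A hypothetical outlier indicator in this spirit scores each sample by its mean distance to its k nearest neighbours in the projected space, so isolated (likely noisy) points receive large values that a clustering objective can down-weight. The letter's exact construction may differ.

```python
import numpy as np

def outlier_indicator(X_reduced, k=10):
    """Toy outlier indicator: mean distance to the k nearest neighbours.

    X_reduced is (n_samples, n_dims) after dimensionality reduction.
    Uses a dense pairwise-distance matrix, so it suits sketches and
    small n only.
    """
    d = np.linalg.norm(X_reduced[:, None, :] - X_reduced[None, :, :], axis=-1)
    np.fill_diagonal(d, np.inf)            # exclude self-distance
    knn = np.sort(d, axis=1)[:, :k]        # k smallest distances per point
    score = knn.mean(axis=1)
    return score / score.max()             # normalized to [0, 1]
```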
Interbeat Interval Filtering
IF 3.2 | CAS Zone 2 (Engineering & Technology)
IEEE Signal Processing Letters | Pub Date: 2024-12-25 | DOI: 10.1109/LSP.2024.3522853
İlker Bayram
Abstract: Several inhibitory and excitatory factors regulate the beating of the heart; consequently, interbeat intervals (IBIs) vary around a mean value. Various statistics have been proposed to capture heart rate variability (HRV) and give a glimpse into this balance. These statistics require accurate estimation of IBIs as a first step, which can be challenging, especially for signals recorded in ambulatory conditions. We propose a lightweight state-space filter that models the IBIs as samples of an inverse Gaussian distribution with time-varying parameters. We make the filter robust against outliers by adapting the probabilistic data association filter to this setup. We demonstrate that the resulting filter accurately identifies outliers and that the parameters of the tracked distribution can be used to compute a specific HRV statistic (the standard deviation of normal-to-normal intervals, SDNN) without further analysis.
Volume 32, pp. 481-485.
Citations: 0
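SDNN itself is a standard statistic: the sample standard deviation of normal-to-normal intervals after outlier beats (ectopic or mis-detected) are excluded, which is the role the letter's robust filter plays upstream. A minimal computation:

```python
import numpy as np

def sdnn(ibis_ms, outlier_mask=None):
    """SDNN: standard deviation of normal-to-normal intervals, in ms.

    `outlier_mask` marks beats flagged as outliers by an upstream
    filter; they are excluded before the statistic is computed.
    """
    ibis = np.asarray(ibis_ms, dtype=float)
    if outlier_mask is not None:
        ibis = ibis[~np.asarray(outlier_mask, dtype=bool)]
    return ibis.std(ddof=1)

# Usage: sdnn([812, 798, 1450, 805], outlier_mask=[0, 0, 1, 0])
```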
STSPhys: Enhanced Remote Heart Rate Measurement With Spatial-Temporal SwiftFormer
IF 3.2 | CAS Zone 2 (Engineering & Technology)
IEEE Signal Processing Letters | Pub Date: 2024-12-25 | DOI: 10.1109/LSP.2024.3522854
Hyunduk Kim; Sang-Heon Lee; Myoung-Kyu Sohn; Jungkwang Kim; Hyeyoung Park
Abstract: Estimating heart activity and physiological signals from facial video without any contact, known as remote photoplethysmography and remote heart rate estimation, holds significant potential for numerous applications. In this letter, we present a novel approach to remote heart rate measurement leveraging a Spatial-Temporal SwiftFormer architecture (STSPhys). Our model addresses the limitations of existing methods that rely heavily on 3D CNNs or 3D visual transformers, which often suffer from large parameter counts and potential instability during training. By integrating both spatial and temporal information from facial video data, STSPhys achieves robust and accurate heart rate estimation. We additionally introduce a hybrid loss function that integrates constraints from both the time and frequency domains, further improving accuracy. Experimental results demonstrate that STSPhys significantly outperforms existing state-of-the-art methods on both intra-dataset and cross-dataset tests, achieving superior performance with fewer parameters and lower computational complexity.
Volume 32, pp. 521-525.
Citations: 0
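A hybrid time/frequency loss of the kind described might combine a negative Pearson correlation term in the time domain with an L1 distance between normalized magnitude spectra; the weighting and exact terms below are assumptions, since the letter only states that both domains are constrained.

```python
import torch
import torch.nn.functional as F

def hybrid_rppg_loss(pred, target, alpha=0.5, eps=1e-8):
    """Sketch of a hybrid time/frequency rPPG loss.

    pred, target: (batch, time) waveforms. Time term: 1 - Pearson r.
    Frequency term: L1 between normalized magnitude spectra.
    """
    p = (pred - pred.mean(-1, keepdim=True)) / (pred.std(-1, keepdim=True) + eps)
    t = (target - target.mean(-1, keepdim=True)) / (target.std(-1, keepdim=True) + eps)
    time_loss = (1 - (p * t).mean(-1)).mean()

    P = torch.fft.rfft(pred, dim=-1).abs()
    T = torch.fft.rfft(target, dim=-1).abs()
    freq_loss = F.l1_loss(P / (P.sum(-1, keepdim=True) + eps),
                          T / (T.sum(-1, keepdim=True) + eps))
    return alpha * time_loss + (1 - alpha) * freq_loss
```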
Adaptive Surveillance Video Compression With Background Hyperprior
IF 3.2 | CAS Zone 2 (Engineering & Technology)
IEEE Signal Processing Letters | Pub Date: 2024-12-25 | DOI: 10.1109/LSP.2024.3521663
Yu Zhao; Song Tang; Mao Ye
Abstract: Neural surveillance video compression methods have demonstrated significant improvements over traditional video compression techniques. In current surveillance video compression frameworks, the first frame in a Group of Pictures (GOP) is typically compressed fully as an I frame, and the subsequent P frames are compressed by referencing it in Low Delay P (LDP) encoding mode. This approach, however, overlooks background information, which limits its adaptability to different scenarios. In this paper, we propose a novel Adaptive Surveillance Video Compression framework based on a background hyperprior, dubbed ASVC. The background hyperprior serves as side information to assist coding in both the temporal and spatial domains. Our method consists of two components: first, the background information of a GOP is extracted, modeled as a hyperprior, and compressed with existing methods; this hyperprior is then used as side information to compress both I frames and P frames. ASVC effectively captures the temporal dependencies in the latent representations of surveillance videos by leveraging the background hyperprior for auxiliary video encoding. Experimental results demonstrate that applying ASVC to both traditional and learning-based methods significantly improves performance.
Volume 32, pp. 456-460.
Citations: 0
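A simple way to obtain a GOP-level background of the kind such a hyperprior could be built from, assuming a mostly static surveillance camera, is a per-pixel temporal median; how ASVC actually extracts and compresses the background is not specified here.

```python
import numpy as np

def gop_background(frames):
    """Temporal-median background estimate for one GOP.

    frames: list of (H, W, C) arrays. With a static camera, the
    per-pixel median over the GOP suppresses moving foreground and
    leaves a background frame usable as side information for coding.
    """
    stack = np.stack(frames, axis=0)               # (T, H, W, C)
    return np.median(stack, axis=0).astype(stack.dtype)
```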