IEEE Signal Processing Letters: Latest Articles (Page 9)

Integrated DNN-Based Parameter Estimation for Multichannel Speech Enhancement

IF 3.9 · Zone 2 (Engineering and Technology)

IEEE Signal Processing Letters Pub Date : 2025-08-14 DOI: 10.1109/LSP.2025.3599455

Sein Cheong;Minseung Kim;Jong Won Shin

Abstract: One popular configuration for statistical model-based multichannel speech enhancement (SE) is to apply a spatial filter, such as the minimum-variance distortionless response (MVDR) beamformer, followed by a single-channel post-filter, and some deep neural network (DNN)-based approaches mimic it. While many DNN-based SE methods have focused on directly estimating clean speech features or masks for estimating clean speech, some efforts have been devoted to estimating the statistical parameters. DNN-based parameter estimation with two DNNs, one for the beamforming stage and one for the post-filtering stage, has demonstrated impressive performance, but the parameter estimation for the beamformer and that for the post-filter operate separately, which may not be optimal in that the post-filter cannot utilize spatial information from the multi-microphone signals. In this letter, we propose integrated DNN-based parameter estimation for multichannel SE based on both the beamformer output and the multi-microphone signals. The speech presence probability and the power spectral densities of speech and noise estimated in the beamforming stage are utilized in the post-filtering stage for better parameter estimation. We also adopt a dual-path conformer structure with an encoder and decoders to enhance performance. Experimental results show that the proposed method achieved the best wideband perceptual evaluation of speech quality (PESQ) scores on the CHiME-4 dataset among all methods with comparable computational complexity.

IEEE Signal Processing Letters, vol. 32, pp. 3320-3324.
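For readers unfamiliar with the configuration the abstract refers to (a minimum-variance distortionless response beamformer followed by a single-channel post-filter), the standard textbook forms are sketched below. They are not taken from the letter itself. The symbols are assumptions chosen for illustration: $\mathbf{y}$ is the multi-microphone observation vector, $\mathbf{d}$ the steering vector, $\boldsymbol{\Phi}_{n}$ the noise spatial covariance matrix, and $\phi_{s}$, $\phi_{n}$ the speech and noise power spectral densities at the beamformer output. The Wiener gain shown is only one common choice of post-filter.

```latex
\mathbf{w}_{\mathrm{MVDR}}(f) = \frac{\boldsymbol{\Phi}_{n}^{-1}(f)\,\mathbf{d}(f)}{\mathbf{d}^{\mathsf{H}}(f)\,\boldsymbol{\Phi}_{n}^{-1}(f)\,\mathbf{d}(f)}, \qquad
G(f) = \frac{\phi_{s}(f)}{\phi_{s}(f) + \phi_{n}(f)}, \qquad
\hat{S}(f) = G(f)\,\mathbf{w}_{\mathrm{MVDR}}^{\mathsf{H}}(f)\,\mathbf{y}(f)
```

In this classical formulation the gain $G$ depends only on quantities at the beamformer output, which illustrates the limitation the abstract describes: the post-filter has no direct access to the spatial information in $\mathbf{y}$.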

Citations: 0

Learning Weakly Monotone Operators for Convergent Plug-and-Play PET Reconstruction

IF 3.9 · Zone 2 (Engineering and Technology)

IEEE Signal Processing Letters Pub Date : 2025-08-13 DOI: 10.1109/LSP.2025.3598700

Marion Savanier;Claude Comtat;Florent Sureau

Citations: 0

TO-LF: A Texture and Occlusion-Oriented Benchmark Dataset for Light Field Disparity Estimation

IF 3.9 · Zone 2 (Engineering and Technology)

IEEE Signal Processing Letters Pub Date : 2025-08-13 DOI: 10.1109/LSP.2025.3598728

Shubo Zhou;Yunlong Wang;Yingqian Wang;Fei Liu;Xue-qin Jiang

Citations: 0

Large Language Models Can Achieve Explainable and Training-Free One-Shot HRRP ATR

IF 3.9 · Zone 2 (Engineering and Technology)

IEEE Signal Processing Letters Pub Date : 2025-08-12 DOI: 10.1109/LSP.2025.3598220

Lingfeng Chen;Panhe Hu;Zhiliang Pan;Qi Liu;Shuanghui Zhang;Zhen Liu

Citations: 0

A Technical Odyssey of Self-Supervised Representation Learning for Devanagari-Script-Based P300 Speller

IF 3.9 · Zone 2 (Engineering and Technology)

IEEE Signal Processing Letters Pub Date : 2025-08-11 DOI: 10.1109/LSP.2025.3597875

Vibha Bhandari;Narendra D. Londhe;Ghanahshyam B. Kshirsagar

Abstract: Traditional supervised learning (SL) methods for P300 event-related potential (ERP) detection in P300 spellers require extensive labelled data and often struggle to generalize across subjects and trials, especially with limited data. Previous efforts using transfer learning and knowledge distillation improved performance but still suffer from high computational complexity and a lack of transparency. These issues highlight the need to explore new approaches that enhance transferability and reduce uncertainty. To address this, we investigated the effectiveness of representation learning through a self-supervised approach. Our self-supervised learning (SSL) framework, featuring a compact convolutional neural network (CNN) backbone and label-agnostic characteristics, improves the robustness of learned features to the ERP variations encountered in P300 spellers. Experiments on self-recorded data and ablation studies show that the learned representations are robust and effective. The downstream classifier trained on the SSL framework achieved an accuracy of 84%, performing competitively with traditional supervised methods. Additionally, a comparison between features learned with SL and SSL, using t-SNE visualization and correlation coefficient (r = -0.51) analysis, demonstrates that SSL features offer better discrimination between P300 and non-P300.

IEEE Signal Processing Letters, vol. 32, pp. 3420-3424.

Citations: 0

Quaternion Wavelet-Driven Multi-Scale Feature Interaction Network for Color Image Denoising

IF 3.9 · Zone 2 (Engineering and Technology)

IEEE Signal Processing Letters Pub Date : 2025-08-11 DOI: 10.1109/LSP.2025.3597878

Shan Gai;Yihao Wu;Shiguang Lu

Abstract: Real-valued wavelets have achieved great success in image denoising due to their sparse representation capability under multi-scale analysis. However, existing real-valued wavelets suffer from limited directional selectivity and translation sensitivity, which can lead to color distortion and loss of phase information. The quaternion wavelet transform (QWT) offers a new solution by extending each pair of complex filters in the dual-tree complex wavelet transform to quaternion-valued filter banks, generating quaternion high-frequency subbands in three principal directions while retaining a low-frequency approximation, thus achieving cross-channel translation invariance and phase consistency. Based on this, we propose a QWT-driven multi-scale feature interaction network (QMFINet). QMFINet leverages QWT to extract cross-channel structured phase features at the same spatial locations, precisely linking color and texture details; it further employs a three-path feature extraction module (TPFEM) to capture multi-scale representations. To effectively fuse features at different resolutions, we design a quaternion ordered channel attention subnet (QOCAS). Experimental results demonstrate that QMFINet outperforms several state-of-the-art color image denoising methods across a range of noise levels, and achieves the best performance at $\sigma = 75$, with an average PSNR improvement of approximately 0.3-0.4 dB over the previous state-of-the-art method.

IEEE Signal Processing Letters, vol. 32, pp. 3425-3429.
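To put the reported PSNR gain in context, the usual definition of peak signal-to-noise ratio can be sketched as follows. This is a minimal illustration and not the authors' evaluation code: the function name `psnr`, the flat pixel-sequence input, and the default peak value of 255 (8-bit images) are all assumptions made here.

```python
import math


def psnr(reference, estimate, peak=255.0):
    """Peak signal-to-noise ratio in dB between two equal-length pixel sequences.

    `peak` is the maximum possible pixel value (assumed 255 for 8-bit images).
    """
    if len(reference) != len(estimate) or not reference:
        raise ValueError("inputs must be non-empty and of equal length")
    # Mean squared error over all pixels.
    mse = sum((r - e) ** 2 for r, e in zip(reference, estimate)) / len(reference)
    if mse == 0:
        return math.inf  # identical signals: PSNR is unbounded
    return 10.0 * math.log10(peak ** 2 / mse)
```

Because PSNR is logarithmic in the mean squared error, a gain of 0.3 dB corresponds to roughly a 7% reduction in MSE (a factor of 10^(-0.03) ≈ 0.933), and 0.4 dB to roughly 9%.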

Citations: 0

Chroma Subsampling for Enhanced Geometry-Based Point Cloud Compression

IF 3.9 · Zone 2 (Engineering and Technology)

IEEE Signal Processing Letters Pub Date : 2025-08-11 DOI: 10.1109/LSP.2025.3597874

Zehan Wang;Yuxuan Wei;Jongseok Lee;Hyejung Hur;Hui Yuan

Citations: 0

Fast and Provable Low-Rank High-Order Tensor Completion via Scaled Gradient Descent

IF 3.9 · Zone 2 (Engineering and Technology)

IEEE Signal Processing Letters Pub Date : 2025-08-11 DOI: 10.1109/LSP.2025.3597829

Tong Wu;Fang Zhang

Citations: 0

Location-Aided Maximal Ratio Combining for an Acoustic Vector Sensor in Multipath Channels

IF 3.9 · Zone 2 (Engineering and Technology)

IEEE Signal Processing Letters Pub Date : 2025-08-11 DOI: 10.1109/LSP.2025.3597557

Xinghao Qu;Zhigang Shang;Gang Qiao;Yiwen Zhou

Citations: 0

Deformable Locality-Coordination Graph Motifs for 3D Skeleton Based Person Re-Identification

IF 3.9 · Zone 2 (Engineering and Technology)

IEEE Signal Processing Letters Pub Date : 2025-08-11 DOI: 10.1109/LSP.2025.3597562

Haocong Rao;Chunyan Miao

Abstract: Existing 3D skeleton based person re-identification (re-ID) approaches typically model skeletons as graphs to capture body relations and motion. However, they often rely on fixed joint connections such as adjacency for relation modeling, while lacking a flexible and specific focus on key body joints or parts of different levels to capture various local relations ("locality") and limb relations ("coordination"). In this letter, we propose Deformable Locality-Coordination graph Motifs (DL-CM) that can guide body relation learning to specifically capture the multi-order locality and coordination of key gait-specific body parts to enhance person re-ID performance. Specifically, we first devise Deformable Locality Motifs (DLM), which are applicable to deformed skeleton graphs at different levels, to simultaneously focus on the relations of different-order neighbors for body structure and pattern learning. Then, we propose Deformable Coordination Motifs (DCM) to concurrently capture local and global coordination of different-level limbs in deformed graphs, so as to facilitate learning discriminative gait patterns for person re-ID. Extensive experiments on four public benchmarks demonstrate the effectiveness of DL-CM in improving person re-ID performance on state-of-the-art models and different-level graph representations.

IEEE Signal Processing Letters, vol. 32, pp. 3655-3659.

Citations: 0