IEEE Signal Processing Letters: Latest Articles

Hardware-Decoder-Friendly High Throughput String Prediction for SCC Implemented in AVS3
IF 3.2, Zone 2 (Engineering and Technology)
IEEE Signal Processing Letters Pub Date : 2025-02-25 DOI: 10.1109/LSP.2025.3545284
Liping Zhao;Zhuge Yan;Zongda Wu;Jiangda Wang;Tao Lin
Abstract: String prediction (SP) is a highly efficient screen content coding technique adopted into international and Chinese video coding standards. However, SP requires a high number of SRAM fetches to decode and output a block for display, leading to low throughput (T). Low T results in a high decoder and SRAM clock frequency to output the required number of display pixels, which is determined by the specific display resolution and frame rate. To achieve hardware-decoder-friendly high throughput SP (HTSP), this paper exploits specific SRAM fetch rate constraints for five SRAM-cell sizes commonly used in hardware decoder designs. Additionally, the optimal reference string selection process is formulated as a multi-constraint rate-distortion optimization (MCRDO) problem and a novel reference string searching method is presented. HTSP boosts throughput by up to 4 times compared to the state-of-the-art SP, with only a negligible impact on coding efficiency.
IEEE Signal Processing Letters, vol. 32, pp. 1166-1170. Journal Article.
Citations: 0
OSFusion: A One-Stream Infrared and Visible Image Fusion Framework
IF 3.2, Zone 2 (Engineering and Technology)
IEEE Signal Processing Letters Pub Date : 2025-02-25 DOI: 10.1109/LSP.2025.3545293
Shengjia An;Zhi Li;Shaorong Zhang;Yongjun Wang;Bineng Zhong
Abstract: The current popular two-stream two-stage image fusion framework extracts features of infrared and visible images separately and then performs feature fusion. The extracted features lack interaction between the source images and have limited cross-modal complementary capability. To address these issues, we propose a novel one-stream infrared and visible image fusion (OSFusion) framework that connects a source image pair to achieve bidirectional information flow. In this way, the fused features with cross-modal complementary information can be dynamically extracted by mutual guidance. To further improve the inference efficiency and obtain high-quality fused images, a feature extraction and fusion module (FEFM) based on the Transformer structure is proposed, which combines feature extraction and feature fusion in a single module. Since there is no need for an extra feature interaction module and the implementation is highly parallel, the speed of image fusion is extremely fast. Benefiting from the one-stream structure and FEFM, OSFusion achieves promising infrared and visible image fusion performance on the MSRS, M3FD, and RoadScene datasets. Besides, our method achieves a good balance in the trade-off between performance and complexity, and also shows a faster convergence trend.
IEEE Signal Processing Letters, vol. 32, pp. 1086-1090. Journal Article.
Citations: 0
Learning Robust Representations by Autoencoders With Dynamical Implicit Mapping
IF 3.2, Zone 2 (Engineering and Technology)
IEEE Signal Processing Letters Pub Date : 2025-02-24 DOI: 10.1109/LSP.2025.3544969
Jianda Zeng;Weili Jiang;Zhang Yi;Yong-Guo Shi;Jianyong Wang
Abstract: The autoencoder is an unsupervised neural network that learns effective representations of data and has wide applications in feature learning, data compression, etc. However, the autoencoder is very sensitive to noise, resulting in low generalization and robustness of the model. To solve this problem, we propose a stable and efficient autoencoder model called nmFunc-Autoencoder. Inspired by the Neural Memory Ordinary Differential Equation, the Neural Memory Activation Function uses its excellent dynamic nonlinear implicit mapping to establish a mapping relationship between external inputs and stable values to ensure the stability of distinguishable feature extraction, thereby achieving better robustness when subjected to noise attacks. We conduct robustness experiments to evaluate its performance. The results show that, compared with other autoencoder models, the data features extracted by the proposed model are more robust. Subsequently, in the execution efficiency experiments and ablation study, the model is shown to be low-cost and effective.
IEEE Signal Processing Letters, vol. 32, pp. 1056-1060. Journal Article.
Citations: 0
Segment Anything for Visual Bird Sound Denoising
IF 3.2, Zone 2 (Engineering and Technology)
IEEE Signal Processing Letters Pub Date : 2025-02-24 DOI: 10.1109/LSP.2025.3545005
Chenxi Zhou;Tianjiao Wan;Kele Xu;Peng Qiao;Yong Dou
Abstract: Current audio denoising methods perform well with synthetic noise but struggle with complex natural noise, especially for bird sounds, which contain natural environmental sounds such as wind and rain, making it challenging to extract clean bird sounds. This issue becomes more pronounced with short and faint bird sounds, where existing methods are less effective. In this paper, we introduce BudSAM, a novel audio denoising model that incorporates the Segment Anything Model (SAM), originally designed for image segmentation tasks, into the field of visual bird sound denoising. By treating audio denoising as a segmentation task, BudSAM utilizes SAM's powerful segmentation capabilities, and we incorporate BCE and Dice losses to enhance the model's ability to segment weak signals, effectively isolating the clean bird sounds that are often masked by background noise. Our method is evaluated on the BirdSoundsDenoising dataset, achieving a 4.0% improvement in IoU and a 0.77 dB increase in SDR compared to state-of-the-art methods. To the best of the authors' knowledge, BudSAM marks the first attempt to employ SAM in an audio denoising task, offering a promising direction for future research and real-world bird sound processing tasks.
IEEE Signal Processing Letters, vol. 32, pp. 1076-1080. Journal Article.
Citations: 0
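The abstract above mentions combining BCE and Dice losses for segmenting weak signals but does not say how they are combined. A minimal sketch of one common formulation follows, in plain Python on flattened masks. The equal weighting (`bce_weight=0.5`), the smoothing constant, and all function names are assumptions for illustration, not the authors' implementation.

```python
import math

def bce_loss(pred, target, eps=1e-7):
    """Mean binary cross-entropy over a flattened mask of probabilities."""
    total = 0.0
    for p, t in zip(pred, target):
        p = min(max(p, eps), 1.0 - eps)  # clamp to avoid log(0)
        total += -(t * math.log(p) + (1 - t) * math.log(1 - p))
    return total / len(pred)

def dice_loss(pred, target, smooth=1.0):
    """1 - Dice coefficient; the smoothing term keeps empty masks well-defined."""
    intersection = sum(p * t for p, t in zip(pred, target))
    denominator = sum(pred) + sum(target)
    return 1.0 - (2.0 * intersection + smooth) / (denominator + smooth)

def combined_loss(pred, target, bce_weight=0.5):
    """Weighted sum of BCE and Dice; the 0.5 weight is a hypothetical choice."""
    return bce_weight * bce_loss(pred, target) + (1.0 - bce_weight) * dice_loss(pred, target)

mask = [1, 0, 1]
good = combined_loss([0.9, 0.1, 0.9], mask)
bad = combined_loss([0.1, 0.9, 0.1], mask)
print(good < bad)  # a closer prediction yields a lower loss
```

Dice is driven by overlap rather than per-pixel counts, which is one reason it is often paired with BCE when the foreground (here, a faint bird call in a spectrogram) occupies only a small part of the mask.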
Optimizing a-DCF for Spoofing-Robust Speaker Verification
IF 3.2, Zone 2 (Engineering and Technology)
IEEE Signal Processing Letters Pub Date : 2025-02-24 DOI: 10.1109/LSP.2025.3545290
Oğuzhan Kurnaz;Jagabandhu Mishra;Tomi H. Kinnunen;Cemal Hanilçi
Abstract: Automatic speaker verification (ASV) systems are vulnerable to spoofing attacks. We propose a spoofing-robust ASV system optimized directly for the recently introduced architecture-agnostic detection cost function (a-DCF), which allows targeting a desired trade-off between the contradicting aims of user convenience and robustness to spoofing. We combine a-DCF and binary cross-entropy (BCE) with a novel straightforward threshold optimization technique. Our results with an embedding fusion system on ASVspoof2019 data demonstrate a relative improvement of 13% over a system trained using BCE only (from a minimum a-DCF of 0.1445 to 0.1254). Using an alternative non-linear score fusion approach provides a relative improvement of 43% (from a minimum a-DCF of 0.0508 to 0.0289).
IEEE Signal Processing Letters, vol. 32, pp. 1081-1085. Journal Article.
Citations: 0
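The relative improvements quoted in the abstract above can be checked from the reported minimum a-DCF values. The short sketch below assumes "relative improvement" means the reduction in cost divided by the baseline cost; the function name is illustrative only.

```python
def relative_improvement(baseline: float, improved: float) -> float:
    """Fractional reduction of a cost metric (lower is better) relative to the baseline."""
    return (baseline - improved) / baseline

# Minimum a-DCF values as reported in the abstract.
embedding_fusion = relative_improvement(0.1445, 0.1254)
score_fusion = relative_improvement(0.0508, 0.0289)

print(f"{embedding_fusion:.1%}")  # 13.2%
print(f"{score_fusion:.1%}")      # 43.1%
```

Under that assumption, both figures round to the 13% and 43% stated in the abstract.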
A 3D Reconstruction and Relocalization Method for Humanoid Welding Robots
IF 3.2, Zone 2 (Engineering and Technology)
IEEE Signal Processing Letters Pub Date : 2025-02-24 DOI: 10.1109/LSP.2025.3544967
Peng Chi;Zhenmin Wang;Haipeng Liao;Ting Li;Qin Zhang
Abstract: Welding robots represent pivotal equipment in intelligent welding for manufacturing and maintenance. Presently, most welding robots are stationary single-arm units, exhibiting limited flexibility and efficiency, thereby compromising welding quality and productivity. Consequently, there is an urgent need to develop a new generation of humanoid welding robots (HWR) endowed with autonomous mobility and dual-arm collaborative capabilities. Key to this advancement are pose estimation and three-dimensional (3D) reconstruction methods, which traditionally focus on mapping and navigating unfamiliar environments, often struggling to adapt to the routine welding and maintenance scenes of large-scale equipment. This paper introduces a novel approach to 3D reconstruction and relocalization tailored for HWR, facilitating rapid localization of welding areas and transmission of point cloud maps. Initially, a vision-based 3D reconstruction system is proposed, encompassing pose estimation, 3D reconstruction, and target detection, enabling self-localization and precise targeting for HWR. Subsequently, a novel method for 3D point cloud map segmentation based on matching 2D features with 3D point clouds is introduced to expedite the transmission of point cloud maps. Finally, a relocalization and point cloud map updating method grounded in prior knowledge is proposed, facilitating seamless welding operations by HWR in routine maintenance scenes. The effectiveness and superiority of the proposed methodology are validated through comparative tests with existing methods using actual HWR.
IEEE Signal Processing Letters, vol. 32, pp. 1071-1075. Journal Article.
Citations: 0
IR-Pro: Baking Probes to Model Indirect Illumination for Inverse Rendering of Scenes
IF 3.2, Zone 2 (Engineering and Technology)
IEEE Signal Processing Letters Pub Date : 2025-02-24 DOI: 10.1109/LSP.2025.3545291
Zhihao Liang;Qi Zhang;Yirui Guan;Ying Feng;Kui Jia
Abstract: Modeling indirect illumination to handle global illumination and decompose materials from multi-view images is challenging, especially in complex scenes with self-occlusion. While recent implicit neural representations show promise in inverse rendering, they struggle with efficient and effective modeling of indirect illumination. Besides, real-time global illumination techniques (e.g., Indirect Lighting Cache) have been successful in gaming. Inspired by this, we present a novel three-stage Inverse Renderer with Probes (IR-Pro), which efficiently caches occlusion to handle indirect illumination. Experiments demonstrate the superiority of IR-Pro over existing methods in the inverse rendering of complex scenes. Furthermore, we successfully integrate the results into digital content creation software and showcase their effectiveness in applications such as relighting, simulation, and editing.
IEEE Signal Processing Letters, vol. 32, pp. 1126-1130. Journal Article.
Citations: 0
Paradoxical Role of Adversarial Attacks: Enabling Crosslinguistic Attacks and Information Hiding in Multilingual Speech Recognition
IF 3.2, Zone 2 (Engineering and Technology)
IEEE Signal Processing Letters Pub Date : 2025-02-24 DOI: 10.1109/LSP.2025.3545276
Wenjie Zhang;Zhihua Xia;Bin Ma;Diqun Yan
Abstract: With the rise of automatic speech recognition (ASR) research and practical applications, enabling adversarial attacks on ASR systems via subtle perturbations has become a priority. Most prior research has focused on single-language, single-model ASR systems. However, multilingual ASR systems hold opportunities for crosslinguistic attacks and covert message transmission. This letter introduces a new approach for crosslinguistic adversarial attacks in multilingual ASR, focusing on information hiding. For example, in military settings, adversarial examples applied to eavesdropping devices can encode messages detectable only by friendly devices, leaving adversaries, even with identical methods, unable to access them. This letter examines multilingual ASR system properties and introduces a crosslinguistic adversarial example with minimal perturbation, allowing friendly classifiers to extract hidden information while being undetectable by hostile classifiers. The experimental results on 5 models and 5 datasets show that the proposed method achieves a success rate of over 90% and an SNR close to 40 dB.
IEEE Signal Processing Letters, vol. 32, pp. 1046-1050. Journal Article.
Citations: 0
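To give a sense of what an SNR close to 40 dB means for an adversarial perturbation, the sketch below computes the signal-to-noise ratio between a clean waveform and its perturbed copy using the standard power-ratio definition. The abstract does not state the exact SNR formula used in the letter, so this definition, the function name, and the toy waveform are assumptions.

```python
import math

def snr_db(clean, perturbed):
    """SNR in dB, treating (perturbed - clean) as the noise."""
    signal_power = sum(x * x for x in clean)
    noise_power = sum((p - x) ** 2 for x, p in zip(clean, perturbed))
    if noise_power == 0:
        return math.inf  # identical signals: no perturbation at all
    return 10.0 * math.log10(signal_power / noise_power)

# Toy waveform with unit amplitude and a perturbation of 1% amplitude per sample.
clean = [1.0, -1.0, 1.0, -1.0]
perturbed = [x + 0.01 for x in clean]
print(round(snr_db(clean, perturbed), 2))  # approximately 40 dB
```

In this toy case the perturbation amplitude is one hundredth of the signal amplitude, which corresponds to a power ratio of about 10^4, that is, roughly 40 dB.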
A Two-Stage Imaging Framework Combining CNN and Physics-Informed Neural Networks for Full-Inverse Tomography: A Case Study in Electrical Impedance Tomography (EIT)
IF 3.2, Zone 2 (Engineering and Technology)
IEEE Signal Processing Letters Pub Date : 2025-02-24 DOI: 10.1109/LSP.2025.3545306
Xuanxuan Yang;Yangming Zhang;Haofeng Chen;Gang Ma;Xiaojie Wang
Abstract: Electrical Impedance Tomography (EIT) is a highly ill-posed inverse problem, with the challenge of reconstructing internal conductivities using only boundary voltage measurements. Although Physics-Informed Neural Networks (PINNs) have shown potential in solving inverse problems, existing approaches are limited in their applicability to EIT, as they often rely on impractical prior knowledge and assumptions that cannot be satisfied in real-world scenarios. To address these limitations, we propose a two-stage hybrid learning framework that combines Convolutional Neural Networks (CNNs) and PINNs. This framework integrates data-driven and model-driven paradigms, blending supervised and unsupervised learning to reconstruct conductivity distributions while ensuring adherence to the underlying physical laws, thereby overcoming the constraints of existing methods.
IEEE Signal Processing Letters, vol. 32, pp. 1096-1100. Journal Article.
Citations: 0
Improved Low-Complexity Sparse Bayesian Learning With Embedded Bayesian Threshold
IF 3.2, Zone 2 (Engineering and Technology)
IEEE Signal Processing Letters Pub Date : 2025-02-21 DOI: 10.1109/LSP.2025.3544536
Yifei Yang;Tengfei Qi;Qianli Wang;Pengcheng Zhu;Xiong Deng
Abstract: Although Sparse Bayesian Learning (SBL) is recognized for its efficacy in sparse signal recovery, its computational demand escalates significantly with increasing data dimensionality due to the matrix inversion at each iteration. An Inverse-Free Sparse Bayesian Learning (IF-SBL) approach has been introduced to mitigate computational complexity. However, IF-SBL converges easily to a sub-optimal solution with false peaks because it neglects the correlation between atoms. In this paper, we analyze the causes of false peaks in IF-SBL. Subsequently, a novel dynamically updated embedded Bayesian threshold is designed to mitigate the interference caused by false peaks. This approach restores stability and reliability without significantly increasing signal recovery complexity compared with IF-SBL. Simulation experiments validate the results.
IEEE Signal Processing Letters, vol. 32, pp. 1066-1070. Journal Article.
Citations: 0