IEEE Transactions on Image Processing (a publication of the IEEE Signal Processing Society): Latest Articles

Tensor Nuclear Norm-Based Multi-Channel Atomic Representation for Robust Face Recognition
Yutao Hu;Yulong Wang;Libin Wang;Han Li;Hong Chen;Yuan Yan Tang
DOI: 10.1109/TIP.2025.3539472
Abstract: Numerous representation-based classification (RC) methods have been developed for face recognition due to their decent model interpretability and robustness against noise. Most existing RC methods characterize the gray-scale reconstruction error image (single-channel data) in one of two ways: a one-dimensional (1D) pixel-based error model or a two-dimensional (2D) gray-scale image-matrix-based error model. The former measures the reconstruction error pixel by pixel, while the latter leverages 2D structural information of the gray-scale error image, such as its low-rank property. However, when these methods are applied to the color channels of a test color face image (multi-channel data) separately and independently, they neglect the three-dimensional (3D) structural correlations among the channels. In real-world scenarios, face images are often contaminated with complex noise, including contiguous occlusion and random pixel corruption, which poses significant challenges to these approaches and can degrade their performance. In this paper, we propose a Tensor Nuclear Norm based Robust Multi-channel Atomic Representation (TNN-RMAR) framework with application to color face recognition. The proposed method has three critical ingredients. 1) We propose a 3D color image-tensor-based error model, which takes full advantage of the 3D structural information of the color error image. 2) To leverage this structure, we model the color error image as a third-order tensor $\mathcal{E}$ and exploit its low-rank property with the tensor nuclear norm. Given that the color channels of a color image are generally corrupted at the same positions, we also design a tube-wise tailored loss function to further leverage its tube-wise structure. 3) We devise a multi-channel atomic norm (MAN) regularization for the representation coefficient matrix, which jointly harnesses the correlations among coefficients in different color channels. In addition, we devise an efficient algorithm to solve the TNN-RMAR framework based on the alternating direction method of multipliers (ADMM). Using TNN-RMAR as a general platform, we further develop several novel robust multi-channel RC methods. Experimental results on benchmark real-world databases validate the effectiveness and robustness of the proposed framework for robust color face recognition.
Pages: 1311-1325 | Published: 2025-02-17
Citations: 0
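The low-rank prior at the heart of TNN-RMAR can be illustrated with the standard tensor nuclear norm computed via the t-SVD: take the FFT along the channel mode, then sum the singular values of the frontal slices. The sketch below is this generic definition, not the authors' code; the function name and toy data are illustrative only.

```python
import numpy as np

def tensor_nuclear_norm(E):
    """Tensor nuclear norm of a 3rd-order tensor via the t-SVD:
    FFT along the third (channel) mode, then sum the singular
    values of every frontal slice, averaged by the FFT length."""
    Ef = np.fft.fft(E, axis=2)          # frontal slices in the Fourier domain
    n3 = E.shape[2]
    total = 0.0
    for k in range(n3):
        s = np.linalg.svd(Ef[:, :, k], compute_uv=False)
        total += s.sum()
    return total / n3

# A rank-1 "error tensor" has a much smaller TNN than dense random noise,
# which is what makes the norm useful as a structured-corruption prior.
rng = np.random.default_rng(0)
u, v = rng.standard_normal((32, 1)), rng.standard_normal((1, 32))
low_rank = np.repeat((u @ v)[:, :, None], 3, axis=2)
noise = rng.standard_normal((32, 32, 3))
print(tensor_nuclear_norm(low_rank) < tensor_nuclear_norm(noise))  # True
```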
Latent Space Learning-Based Ensemble Clustering
Yalan Qin;Nan Pu;Nicu Sebe;Guorui Feng
DOI: 10.1109/TIP.2025.3540297
Abstract: Ensemble clustering fuses a set of base clusterings and shows promising capability in achieving more robust and better clustering results. Existing methods usually realize ensemble clustering by adopting a co-association matrix that measures how many times two data points are categorized into the same cluster across the base clusterings. Though great progress has been achieved, the co-association matrix is constructed from a combination of different connective matrices or their variants. These methods do not explore the inherent latent space shared by the multiple connective matrices or learn the corresponding co-association matrices from that latent space. Moreover, they neglect to learn discriminative connective matrices, to explore the high-order relations among the connective matrices, and to consider the latent space within a unified framework. In this paper, we propose Latent spacE leArning baseD Ensemble Clustering (LEADEC), which introduces the latent space shared by different connective matrices and learns the corresponding connective matrices according to this latent space. Specifically, we factorize the original connective matrices into a consensus latent space representation and specific connective matrices. Meanwhile, an orthogonality constraint is imposed to make the latent representation more discriminative. In addition, we collect the latent-space-based connective matrices into a third-order tensor to investigate the high-order relations among them. The connective matrix learning, the high-order relation investigation, and the latent space representation learning are integrated into a unified framework. Experiments on seven benchmark datasets confirm the superiority of LEADEC over existing representative methods.
Pages: 1259-1270 | Published: 2025-02-14
Citations: 0
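For readers unfamiliar with the co-association matrix that LEADEC builds upon, a minimal sketch of the standard construction (this is the classical definition, not the paper's latent-space method; names and toy labels are illustrative):

```python
import numpy as np

def co_association(base_clusterings):
    """Fraction of base clusterings that place each pair of points in
    the same cluster -- the standard input to ensemble clustering."""
    labels = np.asarray(base_clusterings)       # shape: (m clusterings, n points)
    m, n = labels.shape
    C = np.zeros((n, n))
    for lab in labels:
        C += (lab[:, None] == lab[None, :])     # 1 where the pair co-clusters
    return C / m

base = [[0, 0, 1, 1],
        [0, 0, 0, 1],
        [1, 1, 0, 0]]
C = co_association(base)
print(C[0, 1])  # 1.0 -- points 0 and 1 co-cluster in all three base clusterings
print(C[0, 3])  # 0.0 -- points 0 and 3 never co-cluster
```

The matrix is symmetric with a unit diagonal; consensus methods then cluster it directly, whereas LEADEC instead factorizes the underlying connective matrices through a shared latent space.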
Line-of-Sight Depth Attention for Panoptic Parsing of Distant Small-Faint Instances
Zhongqi Lin;Xudong Jiang;Zengwei Zheng
DOI: 10.1109/TIP.2025.3540265
Abstract: Current scene parsers effectively distill abstract relationships among refined instances but overlook the discrepancies arising from variations in scene depth, which constrains their potential to imitate humans' intrinsic 3D perception. In accordance with the principle of perspective, we advocate first grading the depth of a scene into several slices and then mining semantic correlations within a slice or between multiple slices. Two attention-based components, the Scene Depth Grading Module (SDGM) and the Edge-oriented Correlation Refining Module (EoCRM), comprise our framework, the Line-of-Sight Depth Network (LoSDN). SDGM grades the scene into several slices by calculating depth attention tendencies based on parameters with explicit physical meanings, e.g., albedo, occlusion, and specular embeddings. This process allocates multi-scale instances to each scene slice based on their line-of-sight extension distance, establishing a solid groundwork for ordered association mining in EoCRM. Since the primary step in distinguishing distant faint targets is boundary delineation, EoCRM implements edge-wise saliency quantification and association mining. Quantitative and diagnostic experiments on the Cityscapes, ADE20K, and PASCAL Context datasets reveal the competitiveness of LoSDN and the individual contribution of each component. Visualizations show that our strategy offers clear benefits in detecting distant, faint targets.
Pages: 1354-1366 | Published: 2025-02-14
Citations: 0
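The depth-grading step can be approximated in spirit by quantile-binning a depth map into ordered slices, each with its own mask for subsequent per-slice processing. This toy stand-in uses fixed quantiles, whereas SDGM learns the grading from attention over physically meaningful cues; the function name and data are illustrative.

```python
import numpy as np

def grade_depth(depth, num_slices=4):
    """Quantize a depth map into ordered slices (nearest -> farthest)
    and return one boolean mask per slice -- a fixed-quantile stand-in
    for the learned depth grading that precedes correlation mining."""
    edges = np.quantile(depth, np.linspace(0, 1, num_slices + 1))
    edges[-1] += 1e-6                       # make the last bin include the max
    idx = np.clip(np.digitize(depth, edges) - 1, 0, num_slices - 1)
    return [idx == s for s in range(num_slices)]

depth = np.linspace(1.0, 80.0, 16).reshape(4, 4)   # toy metric depth map
masks = grade_depth(depth, num_slices=4)
print([int(m.sum()) for m in masks])  # [4, 4, 4, 4] -- equal-population slices
```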
ADStereo: Efficient Stereo Matching With Adaptive Downsampling and Disparity Alignment
Yun Wang;Kunhong Li;Longguang Wang;Junjie Hu;Dapeng Oliver Wu;Yulan Guo
DOI: 10.1109/TIP.2025.3540282
Abstract: The balance between accuracy and computational efficiency is crucial for applying deep learning-based stereo matching algorithms in real-world scenarios. Since matching cost aggregation is usually the most computationally expensive component, a common practice is to construct cost volumes at a low resolution for aggregation and then directly regress a high-resolution disparity map. However, current solutions often suffer from the loss of discriminative features caused by downsampling operations that treat all pixels equally, and from spatial misalignment resulting from repeated downsampling and upsampling. To overcome these challenges, this paper presents two sampling strategies, the Adaptive Downsampling Module (ADM) and the Disparity Alignment Module (DAM), that prioritize real-time inference while ensuring accuracy. The ADM leverages local features to learn adaptive weights, enabling more effective downsampling while preserving crucial structural information. The DAM employs a learnable interpolation strategy to predict per-pixel transformation offsets, thereby mitigating the spatial misalignment issue. Building upon these modules, we introduce ADStereo, a real-time yet accurate network that achieves highly competitive performance on multiple public benchmarks. Specifically, ADStereo runs over $5\times$ faster than the state-of-the-art CREStereo (0.054s vs. 0.29s) on the same hardware while achieving comparable accuracy (1.82% vs. 1.69%) on the KITTI stereo 2015 benchmark. The code is available at: https://github.com/cocowy1/ADStereo
Pages: 1204-1218 | Published: 2025-02-14
Citations: 0
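The idea behind content-adaptive downsampling, as opposed to uniform average pooling, can be sketched in a few lines: each output pixel is a softmax-weighted average of its 2x2 input patch, so a confident weight can preserve a thin structure that plain pooling would wash out. Here the logits are supplied by hand; in ADM they are learned from local features. All names are illustrative.

```python
import numpy as np

def adaptive_downsample(x, w_logits):
    """2x downsampling where each output pixel is a softmax-weighted
    average of its 2x2 input patch instead of a uniform average.
    `w_logits` stands in for the learned per-pixel weights."""
    H, W = x.shape
    patches = x.reshape(H // 2, 2, W // 2, 2).transpose(0, 2, 1, 3).reshape(H // 2, W // 2, 4)
    logits = w_logits.reshape(H // 2, 2, W // 2, 2).transpose(0, 2, 1, 3).reshape(H // 2, W // 2, 4)
    w = np.exp(logits - logits.max(axis=-1, keepdims=True))   # stable softmax
    w /= w.sum(axis=-1, keepdims=True)
    return (patches * w).sum(axis=-1)

x = np.array([[0., 10.], [0., 0.]])   # a single bright (edge) pixel
# Uniform logits reduce to average pooling; a large logit keeps the edge pixel.
print(adaptive_downsample(x, np.zeros((2, 2)))[0, 0])                   # 2.5
print(adaptive_downsample(x, np.array([[0., 100.], [0., 0.]]))[0, 0])   # ~10.0
```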
Contrastive Neuron Pruning for Backdoor Defense
Yu Feng;Benteng Ma;Dongnan Liu;Yanning Zhang;Weidong Cai;Yong Xia
DOI: 10.1109/TIP.2025.3539466
Abstract: Recent studies have revealed that deep neural networks (DNNs) are susceptible to backdoor attacks, in which attackers insert a pre-defined backdoor into a DNN model by poisoning a few training samples. A small subset of neurons is responsible for activating this backdoor, and pruning these backdoor-associated neurons has been shown to mitigate such attacks. Current neuron pruning techniques often struggle to accurately identify these critical neurons and typically depend on labeled clean data, which is not always available. To address these challenges, we propose a novel defense strategy called Contrastive Neuron Pruning (CNP). It builds on the observation that poisoned samples tend to cluster together and are distinguishable from benign samples in the feature space of a backdoored model. Given a backdoored model, we first apply a reversed trigger to benign samples, generating multiple positive (benign-benign) and negative (benign-poisoned) feature pairs from the backdoored model. We then employ contrastive learning on these pairs to improve the separation between benign and poisoned features. Subsequently, we identify and prune neurons in the Batch Normalization layers that show significant response differences to the generated pairs. By removing these backdoor-associated neurons, CNP effectively defends against backdoor attacks while pruning only about 1% of the total neurons. Comprehensive experiments on various benchmarks validate the efficacy of CNP, demonstrating its robustness and effectiveness in mitigating backdoor attacks compared to existing methods.
Pages: 1234-1245 | Published: 2025-02-13
Citations: 0
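The final selection step, ranking channels by how differently they respond to benign versus trigger-bearing inputs and pruning the top ~1%, can be sketched as below. This is a toy stand-in for CNP's criterion (the real method operates on contrastively refined features of a backdoored model); the function name, data, and threshold are illustrative.

```python
import numpy as np

def select_backdoor_channels(act_benign, act_triggered, frac=0.01):
    """Rank channels by the gap between their mean activations on
    triggered vs. benign inputs and return the top `frac` fraction
    as pruning candidates."""
    gap = np.abs(act_triggered.mean(axis=0) - act_benign.mean(axis=0))
    k = max(1, int(frac * gap.size))
    return np.argsort(gap)[-k:]            # indices of the largest gaps

rng = np.random.default_rng(1)
benign = rng.standard_normal((64, 100))    # 64 samples, 100 BN channels
triggered = benign.copy()
triggered[:, 7] += 5.0                     # channel 7 fires only on the trigger
print(select_backdoor_channels(benign, triggered))  # [7]
```

Zeroing the corresponding Batch Normalization scale parameters then silences those channels while leaving the other 99% of the network untouched.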
Shell-Guided Compression of Voxel Radiance Fields
Peiqi Yang;Zhangkai Ni;Hanli Wang;Wenhan Yang;Shiqi Wang;Sam Kwong
DOI: 10.1109/TIP.2025.3538163
Abstract: In this paper, we address the significant memory consumption and redundant components of large-scale voxel-based models, which are commonly encountered in real-world 3D reconstruction scenarios. We propose Shell-guided compression of Voxel Radiance Fields (SVRF), which optimizes a voxel-based model into a shell-like structure to reduce storage costs while maintaining rendering accuracy. Specifically, we first introduce a shell-like constraint that operates in two ways: 1) enhancing the influence of voxels neighboring the surface on the rendering outcome, and 2) expediting the elimination of redundant voxels both inside and outside the surface. Additionally, we introduce adaptive thresholds to ensure appropriate pruning criteria for different scenes. To prevent the erroneous removal of essential object parts, we further employ a dynamic pruning strategy that conducts smooth and precise model pruning during training. The proposed compression method requires no additional labels; it merely requires the guidance of self-supervised learning based on predicted depth, and it can be seamlessly integrated into any voxel-grid-based method. Extensive experimental results demonstrate that our method achieves comparable rendering quality while compressing the original number of voxel grids by more than 70%. Our code will be available at: https://github.com/eezkni/SVRF
Pages: 1179-1191 | Published: 2025-02-10
Citations: 0
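Why a shell saves so much storage can be seen with a toy pruning pass: keep only voxels adjacent to the level set of a density grid and discard everything deep inside or outside the surface. This is a geometric illustration only, not SVRF's learned, depth-guided pruning; the function name and thresholds are illustrative.

```python
import numpy as np

def shell_prune(density, level=0.5, shell=1):
    """Keep only voxels within `shell` cells of the surface (the level
    set of the density), discarding interior and exterior voxels -- a
    toy version of collapsing a dense grid into a shell-like structure."""
    inside = density > level
    keep = np.zeros_like(inside)
    for axis in range(density.ndim):
        for shift in range(1, shell + 1):
            for s in (shift, -shift):
                # a cell is on the shell if its occupancy differs from a neighbor
                keep |= inside ^ np.roll(inside, s, axis=axis)
    return keep

# Solid ball of radius 5 in a 16^3 grid: pruning keeps a thin surface shell.
g = np.indices((16, 16, 16)) - 8
density = (np.sqrt((g ** 2).sum(axis=0)) < 5).astype(float)
mask = shell_prune(density)
print(not mask[8, 8, 8])             # True: the deep interior voxel is dropped
print(mask.sum() < 0.3 * mask.size)  # True: far fewer voxels than the grid
```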
Data Subdivision Based Dual-Weighted Robust Principal Component Analysis
Sisi Wang;Feiping Nie;Zheng Wang;Rong Wang;Xuelong Li
DOI: 10.1109/TIP.2025.3536197
Abstract: Principal Component Analysis (PCA) is one of the most important unsupervised dimensionality reduction algorithms, but its use of the squared $\ell_2$-norm makes it very sensitive to outliers. Improved versions based on the $\ell_1$-norm alleviate this problem but have other shortcomings, such as optimization difficulties or a lack of rotational invariance. Besides, existing methods only vaguely divide samples into normal samples and outliers to improve robustness, ignoring the fact that normal samples can be further divided into positive samples and hard samples, which should contribute differently to the model because positive samples are more conducive to learning the projection matrix. In this paper, we propose a novel Data Subdivision Based Dual-Weighted Robust Principal Component Analysis, namely DRPCA, which first designs a mark vector to distinguish normal samples from outliers and directly removes outliers according to the mark weights. Moreover, we further divide the normal samples into positive samples and hard samples via self-constrained weights and place them in relative positions, so that positive samples receive larger weights than hard samples, which makes the projection matrix more accurate. Additionally, the optimal mean is employed to obtain a more accurate data center. To solve the resulting problem, we carefully design an effective iterative algorithm and analyze its convergence. Experiments on real-world and RGB large-scale datasets demonstrate the superiority of our method in dimensionality reduction and anomaly detection.
Pages: 1271-1284 | Published: 2025-02-10
Citations: 0
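The mark-vector idea, identifying outliers by their reconstruction error and excluding them from the fit, can be sketched with one round of plain SVD-based PCA. This is a heavy simplification of DRPCA (no self-constrained weights, no positive/hard subdivision, no iteration); the function name, threshold, and toy data are illustrative.

```python
import numpy as np

def mark_outliers(X, n_components=1, keep=0.9):
    """One round of the mark-vector idea: fit PCA, measure each sample's
    reconstruction error, and mark the worst (1 - keep) fraction as
    outliers to remove."""
    Xc = X - X.mean(axis=0)
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    P = Vt[:n_components]                          # projection matrix
    resid = np.linalg.norm(Xc - Xc @ P.T @ P, axis=1)
    return resid <= np.quantile(resid, keep)       # True = keep (mark weight 1)

rng = np.random.default_rng(2)
t = 3.0 * rng.standard_normal((50, 1))
X = np.hstack([t, 2.0 * t]) + 0.01 * rng.standard_normal((50, 2))
X[:3] = [[8.0, -8.0], [-6.0, 7.0], [5.0, -5.0]]    # three gross outliers
mask = mark_outliers(X)
print(mask[:3])   # [False False False] -- the corrupted samples are marked
```

A full method would re-fit the projection on the kept samples and iterate, so the subspace estimate is no longer dragged toward the removed points.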
Decouple and Couple: Exploiting Prior Knowledge for Visible Video Watermark Removal
Junye Chen;Chaowei Fang;Jichang Li;Yicheng Leng;Guanbin Li
DOI: 10.1109/TIP.2025.3534033
Abstract: This paper aims to restore the original background images in watermarked videos, overcoming the challenges posed by traditional approaches that fail to handle temporal dynamics and diverse watermark characteristics effectively. Our method introduces a framework that first "decouples" the extraction of prior knowledge, such as common-sense knowledge and residual background details, from the temporal modeling process, allowing background restoration and temporal consistency to be handled independently. It then "couples" these extracted features by integrating them into the temporal modeling backbone of a video inpainting (VI) framework. This integration is facilitated by a specialized module comprising an intrinsic background image prediction sub-module and a dual-branch frame embedding module, designed to reduce watermark interference and enhance the use of prior knowledge. Moreover, a frame-adaptive feature selection module dynamically adjusts the extraction of prior features based on the corruption level of each frame, ensuring their effective incorporation into the temporal processing. Extensive experiments on the YouTube-VOS and DAVIS datasets validate our method's efficiency in watermark removal and background restoration, showing significant improvement over state-of-the-art techniques in visible image watermark removal, video restoration, and video inpainting.
Pages: 1192-1203 | Published: 2025-02-04
Citations: 0
VSR-Net: Vessel-Like Structure Rehabilitation Network With Graph Clustering
Haili Ye;Xiaoqing Zhang;Yan Hu;Huazhu Fu;Jiang Liu
DOI: 10.1109/TIP.2025.3526061
Abstract: The morphologies of vessel-like structures, such as blood vessels and nerve fibres, play significant roles in disease diagnosis, e.g., of Parkinson's disease. Although deep network-based refinement segmentation and topology-preserving segmentation methods have recently achieved promising results on vessel-like structures, they still face two challenges: 1) existing methods are limited in their ability to rehabilitate subsection ruptures in segmented vessel-like structures; and 2) they are typically overconfident in their predicted segmentation results. To tackle these two challenges, this paper leverages the spatial interconnection relationships among subsection ruptures from a structure rehabilitation perspective. Based on this perspective, we propose a novel Vessel-like Structure Rehabilitation Network (VSR-Net) that both rehabilitates subsection ruptures and improves model calibration, starting from coarse vessel-like structure segmentation results. VSR-Net first constructs subsection rupture clusters via a Curvilinear Clustering Module (CCM). Then, a Curvilinear Merging Module (CMM) rehabilitates the subsection ruptures to obtain the refined vessel-like structures. Extensive experiments on six 2D/3D medical image datasets show that VSR-Net significantly outperforms state-of-the-art (SOTA) refinement segmentation methods with lower calibration errors. Additionally, we provide a quantitative analysis showing that the morphological differences between VSR-Net's rehabilitation results and the ground truth (GT) are smaller than those between SOTA methods and the GT, demonstrating that our method rehabilitates vessel-like structures more effectively.
Pages: 1090-1105 | Published: 2025-02-04
Citations: 0
Joint Spatial and Frequency Domain Learning for Lightweight Spectral Image Demosaicing
Fangfang Wu;Tao Huang;Junwei Xu;Xun Cao;Weisheng Dong;Le Dong;Guangming Shi
DOI: 10.1109/TIP.2025.3536217
Abstract: Conventional spectral image demosaicing algorithms rely on pixels' spatial or spectral correlations for reconstruction. Due to the missing data in the multispectral filter array (MSFA), the estimated spatial or spectral correlations are inaccurate, leading to poor reconstruction results, and these algorithms are time-consuming. Deep learning-based spectral image demosaicing methods directly learn the nonlinear mapping between 2D spectral mosaic images and 3D multispectral images. However, these methods learn the mapping only in the spatial domain and neglect valuable image information in the frequency domain, limiting reconstruction quality. To address these issues, this paper proposes a novel lightweight spectral image demosaicing method based on joint spatial and frequency domain learning. First, a parameter-free spectral image initialization strategy based on the Fourier transform is proposed, which yields better initialized spectral images and eases the subsequent spectral image reconstruction. Furthermore, an efficient spatial-frequency transformer network is proposed, which jointly learns spatial correlations and frequency-domain characteristics. Compared to existing learning-based spectral image demosaicing methods, the proposed method significantly reduces the number of model parameters and the computational complexity. Extensive experiments on simulated and real-world data show that the proposed method notably outperforms existing spectral image demosaicing methods.
Pages: 1119-1132 | Published: 2025-02-04
Citations: 0
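What a parameter-free MSFA initialization has to accomplish can be seen with a simple baseline: scatter each band's sampled pixels into its own plane, then fill the gaps by normalized box filtering. This stands in for, and is not, the Fourier-based strategy the paper proposes; the 2x2 pattern, names, and kernel are illustrative.

```python
import numpy as np

def _box(img, kernel):
    """Same-size 2D correlation with zero padding (NumPy only)."""
    kh, kw = kernel.shape
    ph, pw = kh // 2, kw // 2
    padded = np.pad(img, ((ph, ph), (pw, pw)))
    H, W = img.shape
    out = np.zeros((H, W))
    for i in range(kh):
        for j in range(kw):
            out += kernel[i, j] * padded[i:i + H, j:j + W]
    return out

def init_msfa(mosaic, pattern):
    """Initialize a multispectral cube from an MSFA mosaic: place each
    band's samples in its own plane and fill the missing pixels by
    normalized (mask-aware) box filtering."""
    H, W = mosaic.shape
    p = pattern.shape[0]
    bands = int(pattern.max()) + 1
    out = np.zeros((bands, H, W))
    kernel = np.ones((p + 1, p + 1))    # window always covers one full period
    for b in range(bands):
        mask = (np.tile(pattern, (H // p, W // p)) == b).astype(float)
        num = _box(mosaic * mask, kernel)
        den = _box(mask, kernel)
        out[b] = num / np.maximum(den, 1e-8)
    return out

pattern = np.array([[0, 1], [2, 3]])    # toy 2x2 MSFA with 4 bands
mosaic = np.full((8, 8), 5.0)           # a flat scene
cube = init_msfa(mosaic, pattern)
print(np.allclose(cube, 5.0))  # True: a constant scene is recovered exactly
```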