IEEE Transactions on Image Processing (a publication of the IEEE Signal Processing Society): Latest Publications

Frequency-Spatial Complementation: Unified Channel-Specific Style Attack for Cross-Domain Few-Shot Learning
Zhong Ji;Zhilong Wang;Xiyao Liu;Yunlong Yu;Yanwei Pang;Jungong Han
Abstract: Cross-Domain Few-Shot Learning (CD-FSL) addresses the challenge of recognizing targets in out-of-domain data when only a few instances are available. Many current CD-FSL approaches focus primarily on enhancing the generalization capability of models in the spatial domain, neglecting the role of the frequency domain in domain generalization. To exploit the frequency domain's strength in processing global information, we propose a Frequency-Spatial Complementation (FSC) model, which combines frequency-domain and spatial-domain information to learn domain-invariant information from style-attacked data. Specifically, we design a Frequency and Spatial Fusion (FusionFS) module to enhance the model's ability to capture style-related information. In addition, we propose two attack strategies, the Gradient-guided Unified Style Attack (GUSA) strategy and the Channel-specific Attack Intensity Calculation (CAIC) strategy, which conduct targeted attacks on different channels to provide more diversified style data during training, especially in single-source-domain scenarios where the source data style is homogeneous. Extensive experiments across eight target domains demonstrate that our method significantly improves the model's performance under various styles.
DOI: 10.1109/TIP.2025.3553781 | Volume 34, pp. 2242-2253 | Published: 2025-03-28
Citations: 0
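As a rough sketch of the frequency-domain style-attack idea, the snippet below jitters the per-channel amplitude spectrum of an image batch while preserving phase, since the amplitude spectrum is commonly treated as a style carrier. This is a generic PyTorch illustration, not the paper's GUSA/CAIC procedure; `amplitude_style_perturb` and `eps` are hypothetical names.

```python
import torch

def amplitude_style_perturb(x, eps=0.1):
    """Jitter the amplitude spectrum channel-wise while preserving phase.
    x: (B, C, H, W) float tensor."""
    freq = torch.fft.fft2(x, norm="ortho")           # complex spectrum
    amp, pha = freq.abs(), freq.angle()              # amplitude / phase split
    amp = amp * (1.0 + eps * torch.randn_like(amp))  # multiplicative style jitter
    return torch.fft.ifft2(torch.polar(amp, pha), norm="ortho").real

x_style = amplitude_style_perturb(torch.rand(2, 3, 32, 32))
print(x_style.shape)  # torch.Size([2, 3, 32, 32])
```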
Adaptive Dual-Axis Style-Based Recalibration Network With Class-Wise Statistics Loss for Imbalanced Medical Image Classification
Xiaoqing Zhang;Zunjie Xiao;Jingzhe Ma;Xiao Wu;Jilu Zhao;Shuai Zhang;Runzhi Li;Yi Pan;Jiang Liu
Abstract: Salient and small lesions (e.g., microaneurysms on the fundus) both play significant roles in real-world disease diagnosis from medical image examinations. Although deep neural networks (DNNs) have achieved promising medical image classification performance, they are often limited in capturing both salient and small lesion information, restricting performance improvement in imbalanced medical image classification. Recently, with the advent of DNN-based style transfer in medical image generation, clinical styles have attracted great interest, as they are crucial indicators of lesions. Motivated by this observation, we propose a novel Adaptive Dual-Axis Style-based Recalibration (ADSR) module, which leverages clinical styles to guide DNNs in effectively learning salient and small lesion information from a dual-axis perspective. ADSR first emphasizes salient lesion information via global style-based adaptation, then captures small lesion information with pixel-wise style-based fusion. We construct an ADSR-Net for imbalanced medical image classification by stacking multiple ADSR modules. Additionally, DNNs typically adopt cross-entropy (CE) loss for parameter optimization, which ignores the impact of class-wise predicted probability distributions. To address this, we introduce a new Class-wise Statistics Loss (CWS), combined with CE loss, to further boost imbalanced medical image classification results. Extensive experiments on five imbalanced medical image datasets demonstrate not only the superiority of ADSR-Net and CWS over state-of-the-art (SOTA) methods but also their improved confidence calibration. For example, ADSR-Net with the proposed loss significantly outperforms CABNet50 by 21.39% and 27.82% in F1 and B-ACC, while reducing ECE and BS by 3.31% and 4.57% on ISIC2018.
DOI: 10.1109/TIP.2025.3551128 | Volume 34, pp. 2081-2096 | Published: 2025-03-28
Citations: 0
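For intuition, here is a minimal style-based channel recalibration block in the spirit of such modules: channel-wise mean and standard deviation act as a global style descriptor that gates each channel. This is an assumption-laden sketch, not the published ADSR module.

```python
import torch
import torch.nn as nn

class StyleRecalibration(nn.Module):
    """Gate each channel by its own (mean, std) style statistics."""
    def __init__(self):
        super().__init__()
        self.fc = nn.Linear(2, 1)  # per-channel (mean, std) -> gate logit

    def forward(self, x):                                # x: (B, C, H, W)
        flat = x.flatten(2)                              # (B, C, H*W)
        stats = torch.stack([flat.mean(-1), flat.std(-1)], dim=-1)  # (B, C, 2)
        gate = torch.sigmoid(self.fc(stats))             # (B, C, 1)
        return x * gate.unsqueeze(-1)                    # broadcast over H, W

y = StyleRecalibration()(torch.rand(2, 64, 8, 8))
print(y.shape)  # torch.Size([2, 64, 8, 8])
```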
Perception Assisted Transformer for Unsupervised Object Re-Identification
Shuoyi Chen;Mang Ye;Xingping Dong;Bo Du
Abstract: Unsupervised object re-identification (Re-ID) aims to learn discriminative features without identity annotations. Existing mainstream methods are usually developed based on convolutional neural networks for feature extraction and pseudo-label estimation. However, convolutional neural networks suffer from limitations in capturing dispersed long-range dependencies and integrating global information. In comparison, vision transformers demonstrate superior robustness in complex environments, leveraging their versatile modeling capabilities to process diverse data structures with greater precision. In this paper, we delve into the potential of vision transformers in unsupervised Re-ID, proposing a Transformer-based perception-assisted framework (PAT). Considering that Re-ID is a typical fine-grained task, existing unsupervised Re-ID methods relying on pseudo-labels generated by clustering algorithms provide only category-level discriminative supervision, with limited attention to local details. Therefore, we propose a novel target-aware mask alignment (TMA) strategy that provides additional supervision signals by leveraging low-level visual cues. Specifically, we employ pseudo-labels to guide the fine-grained alignment of features with local pixel information from critical discriminative regions. This method establishes a mutual learning mechanism via a shared Transformer, effectively balancing discriminative learning and detailed understanding. Furthermore, we propose a perceptual fusion feature augmentation (PFA) method to optimize instance-level discriminative learning. The proposed method is evaluated on multiple Re-ID datasets, demonstrating superior performance and robustness compared to state-of-the-art techniques. Notably, without annotations, our method achieves better results than many supervised counterparts. The code will be released.
DOI: 10.1109/TIP.2025.3553777 | Volume 34, pp. 2112-2123 | Published: 2025-03-27
Citations: 0
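The clustering-based pseudo-labeling the abstract refers to is commonly implemented with density-based clustering over a feature bank. Below is a minimal sketch of that generic recipe with made-up blob features; it is not PAT's TMA/PFA machinery.

```python
import numpy as np
from sklearn.cluster import DBSCAN
from sklearn.datasets import make_blobs
from sklearn.preprocessing import normalize

# Stand-in for backbone features: 300 samples around 5 hypothetical identities.
feats, _ = make_blobs(n_samples=300, centers=5, n_features=128, cluster_std=0.5)
feats = normalize(feats)                        # L2-normalize for cosine geometry
labels = DBSCAN(eps=0.3, min_samples=4).fit_predict(feats)
# labels == -1 marks outliers, usually dropped for this training round; the
# remaining cluster ids serve as identity pseudo-labels for one Re-ID epoch.
keep = labels != -1
print(f"{labels.max() + 1} clusters, {keep.sum()} usable samples")
```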
Mutually Reinforcing Learning of Decoupled Degradation and Diffusion Enhancement for Unpaired Low-Light Image Lightening
Kangle Wu;Jun Huang;Yong Ma;Fan Fan;Jiayi Ma
Abstract: The Denoising Diffusion Probabilistic Model (DDPM) has demonstrated exceptional performance in low-light enhancement tasks. However, its dependency on paired training data has left the generality of DDPM in low-light enhancement largely untapped. Therefore, this paper proposes a mutually reinforcing learning framework of decoupled degradation and diffusion enhancement, named MRLIE, which leverages style guidance from unpaired low-light images to generate pseudo-image pairs consistent with the target domain, thereby optimizing the subsequent diffusion enhancement network in a supervised manner. During the degradation process, the diffusion loss of the fixed enhancement network serves as an evaluation metric for structure consistency and is combined with an adversarial style loss to form the optimization objective for the degradation network. This loss design ensures that scene structure information is retained during degradation. During the enhancement process, the degradation network, with frozen parameters, continuously generates pseudo-paired low-/normal-light image pairs as training data, so the diffusion enhancement network can be progressively optimized. On the whole, the two processes are interdependent and achieve cooperative improvement in degradation realism and enhancement quality through iterative optimization. Additionally, we propose a Retinex-based decoupled degradation strategy for simulating the complex degradation of real low-light imaging, which ensures the color-correction and noise-suppression capabilities of the diffusion enhancement network. Extensive experiments show that MRLIE achieves promising results and better generality across various datasets.
DOI: 10.1109/TIP.2025.3553070 | Volume 34, pp. 2020-2035 | Published: 2025-03-27
Citations: 0
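A toy version of Retinex-style degradation (decompose I = R x L, dim the illumination component, add noise) might look as follows. This hand-rolled stand-in only illustrates the decomposition idea, not MRLIE's learned degradation network; all names and constants are assumptions.

```python
import torch

def retinex_degrade(img, gamma=2.5, noise_std=0.03):
    """Decompose I = R * L with the channel-max as an illumination proxy,
    suppress L with a gamma curve, then add sensor-like noise.
    img: (B, 3, H, W) in [0, 1]."""
    L = img.max(dim=1, keepdim=True).values.clamp(min=1e-4)  # illumination proxy
    R = img / L                                              # reflectance
    low = R * L.pow(gamma) + noise_std * torch.randn_like(img)
    return low.clamp(0.0, 1.0)

low = retinex_degrade(torch.rand(1, 3, 64, 64))
print(low.shape)  # torch.Size([1, 3, 64, 64])
```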
WSSIC-Net: Weakly-Supervised Semantic Instance Completion of 3D Point Cloud Scenes
Zhiheng Fu;Yulan Guo;Minglin Chen;Qingyong Hu;Hamid Laga;Farid Boussaid;Mohammed Bennamoun
Abstract: Semantic instance completion aims to recover the complete 3D shapes of foreground objects together with their labels from a partial 2.5D scan of a scene. Previous works have relied on full supervision, which requires ground-truth annotations in the form of bounding boxes and complete 3D objects. This has greatly limited their real-world application because the acquisition of ground-truth data is very costly and time-consuming. To address this bottleneck, we propose a Weakly-Supervised Semantic Instance Completion Network (WSSIC-Net), which learns real-world partial point cloud object completion without requiring the ground truth of complete 3D objects. Instead, WSSIC-Net leverages 3D ground-truth bounding boxes, partial objects of a raw scene, and unpaired synthetic 3D point clouds. More specifically, a 3D detector is used to encode partial point clouds into proposal features, which are then fed into two branches. The first branch uses fully supervised box prediction based on proposal features. The second branch, hereinafter called instance completion, leverages the proposal features as partial object features to achieve weakly-supervised instance completion. A Generative Adversarial Network (GAN) completes the partial features of the 2.5D foreground objects of real-world scenes using only unpaired but semantically consistent complete synthetic point clouds. In our experiments, we demonstrate that the fully-supervised 3D detection and the weakly-supervised instance completion complement one another. The qualitative and quantitative evaluations on the ScanNet v2 dataset demonstrate that the proposed "weakly-supervised" approach consistently achieves comparable performance to the state-of-the-art "fully supervised" methods.
DOI: 10.1109/TIP.2024.3520013 | Volume 34, pp. 2008-2019 | Published: 2025-03-27
Citations: 0
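The weakly-supervised branch trains a GAN over object features against unpaired synthetic shapes. A bare-bones feature-level adversarial step could look like the sketch below; the feature size and discriminator head are assumptions, not WSSIC-Net's actual architecture.

```python
import torch
import torch.nn as nn

# Assumed 256-d proposal features and a tiny MLP discriminator; illustrative only.
disc = nn.Sequential(nn.Linear(256, 128), nn.ReLU(), nn.Linear(128, 1))
bce = nn.BCEWithLogitsLoss()

synthetic_feat = torch.randn(8, 256)  # features of complete synthetic objects
completed_feat = torch.randn(8, 256)  # completed features from partial real scans

# Discriminator: real = synthetic complete shapes, fake = completed partial scans.
d_loss = bce(disc(synthetic_feat), torch.ones(8, 1)) + \
         bce(disc(completed_feat.detach()), torch.zeros(8, 1))
# Generator (the completion branch): try to fool the discriminator.
g_loss = bce(disc(completed_feat), torch.ones(8, 1))
print(d_loss.item(), g_loss.item())
```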
DD-RobustBench: An Adversarial Robustness Benchmark for Dataset Distillation
Yifan Wu;Jiawei Du;Ping Liu;Yuewei Lin;Wei Xu;Wenqing Cheng
Abstract: Dataset distillation techniques have revolutionized the way large datasets are utilized by compressing them into smaller, yet highly effective subsets that preserve the original datasets' accuracy. However, while these methods have proven effective in reducing data size and training times, the robustness of these distilled datasets against adversarial attacks remains underexplored. This vulnerability poses significant risks, particularly in security-sensitive applications. To address this critical gap, we introduce DD-RobustBench, a novel and comprehensive benchmark specifically designed to evaluate the adversarial robustness of distilled datasets. Our benchmark is the most extensive of its kind and integrates a variety of dataset distillation techniques, including recent advancements such as TESLA, DREAM, SRe2L, and D4M, which have shown promise in enhancing model performance. DD-RobustBench also rigorously tests these datasets against a diverse array of adversarial attack methods to ensure broad applicability. Our evaluations cover a wide spectrum of datasets, including but not limited to the widely used ImageNet-1K, which allows us to assess the robustness of distilled datasets in scenarios mirroring real-world applications. Furthermore, our detailed quantitative analysis investigates how different components involved in the distillation process, such as data augmentation, downsampling, and clustering, affect dataset robustness. Our findings provide critical insights into which techniques enhance or weaken the resilience of distilled datasets against adversarial threats, offering valuable guidelines for developing more robust distillation methods. Through DD-RobustBench, we aim not only to benchmark but also to push the boundaries of dataset distillation research by highlighting areas for improvement and suggesting pathways toward datasets that are not only compact and efficient but also secure and resilient to adversarial challenges. The implementation details and essential instructions are available on DD-RobustBench.
DOI: 10.1109/TIP.2025.3553786 | Volume 34, pp. 2052-2066 | Published: 2025-03-27 | Open-access PDF: https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=10944256
Citations: 0
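Benchmarks of this kind typically sweep standard attacks over models trained on distilled data. Below is a conventional L-infinity PGD attack loop with common default hyper-parameters; it illustrates the evaluation style, not necessarily the benchmark's exact settings.

```python
import torch
import torch.nn.functional as F

def pgd_attack(model, x, y, eps=8/255, alpha=2/255, steps=10):
    """Standard L-inf PGD: iterated signed-gradient ascent on the loss,
    projected back onto the eps-ball around the clean input x in [0, 1]."""
    x_adv = (x + torch.empty_like(x).uniform_(-eps, eps)).clamp(0, 1)
    for _ in range(steps):
        x_adv = x_adv.clone().detach().requires_grad_(True)
        loss = F.cross_entropy(model(x_adv), y)
        grad = torch.autograd.grad(loss, x_adv)[0]
        x_adv = x_adv.detach() + alpha * grad.sign()            # ascend the loss
        x_adv = (x + (x_adv - x).clamp(-eps, eps)).clamp(0, 1)  # project to ball
    return x_adv
```

Robust accuracy is then just clean-accuracy evaluation on `pgd_attack(model, x, y)` instead of `x`, repeated per attack and per distillation method.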
See Degraded Objects: A Physics-Guided Approach for Object Detection in Adverse Environments
Weifeng Liu;Jian Pang;Bingfeng Zhang;Jin Wang;Baodi Liu;Dapeng Tao
Abstract: In adverse environments, detectors often fail to detect degraded objects because they are almost invisible and their features are weakened by the environment. Common approaches involve image enhancement to support detection, but they inevitably introduce human-invisible noise that negatively impacts the detector. In this work, we propose a physics-guided approach for object detection in adverse environments, which offers a straightforward solution that injects physical priors into the detector, enabling it to detect poorly visible objects. The physical priors, derived from the imaging mechanism and image properties, include an environment prior and a frequency prior. The environment prior is generated from the physical model, e.g., the atmospheric model, and reflects the density of environmental noise. The frequency prior is explored based on the observation that the amplitude spectrum can highlight object regions against the background. The two priors are complementary in principle. Furthermore, we present a physics-guided loss that incorporates a novel weight term, estimated by applying a membership function to the physical priors, which captures the extent of degradation. By backpropagating the physics-guided loss, physics knowledge is injected into the detector to aid in locating degraded objects. We conduct experiments in a synthetic foggy environment, a real foggy environment, and a real underwater scenario. The results demonstrate that our method is effective and achieves state-of-the-art performance. The code is available at https://github.com/PangJian123/See-Degraded-Objects.
DOI: 10.1109/TIP.2025.3551533 | Volume 34, pp. 2198-2212 | Published: 2025-03-26
Citations: 0
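One classical way to derive an atmospheric-model environment prior is the dark channel prior, which yields a transmission map reflecting haze density. The sketch below follows that textbook recipe; it may differ from the paper's exact prior, and `transmission_prior` is a hypothetical name.

```python
import numpy as np
from scipy.ndimage import minimum_filter

def transmission_prior(img, omega=0.95, patch=15):
    """Dark-channel transmission estimate under the atmospheric scattering
    model I = J*t + A*(1 - t); img is (H, W, 3) float in [0, 1]."""
    dark = minimum_filter(img.min(axis=2), size=patch)       # dark channel
    # Airlight A: mean color of the 10 haziest (highest dark-channel) pixels.
    idx = np.argsort(dark.ravel())[-10:]
    A = img.reshape(-1, 3)[idx].mean(axis=0)
    norm_dark = minimum_filter((img / A).min(axis=2), size=patch)
    return 1.0 - omega * norm_dark                           # transmission map t

t = transmission_prior(np.random.rand(64, 64, 3))
print(t.shape)  # (64, 64)
```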
STPNet: Scale-Aware Text Prompt Network for Medical Image Segmentation
Dandan Shan;Zihan Li;Yunxiang Li;Qingde Li;Jie Tian;Qingqi Hong
Abstract: Accurate segmentation of lesions plays a critical role in medical image analysis and diagnosis. Traditional segmentation approaches that rely solely on visual features often struggle with the inherent uncertainty in lesion distribution and size. To address these issues, we propose STPNet, a Scale-aware Text Prompt Network that leverages vision-language modeling to enhance medical image segmentation. Our approach utilizes multi-scale textual descriptions to guide lesion localization and employs retrieval-segmentation joint learning to bridge the semantic gap between visual and linguistic modalities. Crucially, STPNet retrieves relevant textual information from a specialized medical text repository during training, eliminating the need for text input during inference while retaining the benefits of cross-modal learning. We evaluate STPNet on three datasets: COVID-Xray, COVID-CT, and Kvasir-SEG. Experimental results show that our vision-language approach outperforms state-of-the-art segmentation methods, demonstrating the effectiveness of incorporating textual semantic knowledge into medical image analysis. The code has been made publicly available at https://github.com/HUANGLIZI/STPNet.
DOI: 10.1109/TIP.2025.3571672 | Volume 34, pp. 3169-3180 | Published: 2025-03-26
Citations: 0
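The retrieval step can be pictured as nearest-neighbor lookup of pre-embedded text descriptions given a visual feature. A toy sketch with placeholder embedding sizes and random data (not STPNet's actual repository or encoders):

```python
import torch
import torch.nn.functional as F

text_bank = F.normalize(torch.randn(500, 512), dim=1)  # pre-embedded descriptions
img_feat = F.normalize(torch.randn(4, 512), dim=1)     # visual features

sim = img_feat @ text_bank.T                 # cosine similarities, (4, 500)
topk = sim.topk(k=3, dim=1).indices          # 3 closest descriptions per image
retrieved = text_bank[topk]                  # (4, 3, 512) prompt features
print(retrieved.shape)
```

Because the bank is fixed and indexed by visual features alone, no free-text input is needed at inference time, which matches the motivation stated in the abstract.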
Irregular Tensor Low-Rank Representation for Hyperspectral Image Representation
Bo Han;Yuheng Jia;Hui Liu;Junhui Hou
Abstract: Spectral variations pose a common challenge in analyzing hyperspectral images (HSI). To address this, low-rank tensor representation has emerged as a robust strategy, leveraging inherent correlations within HSI data. However, the spatial distribution of ground objects in HSIs is inherently irregular, existing naturally in tensor format, with numerous class-specific regions manifesting as irregular tensors. Current low-rank representation techniques are designed for regular tensor structures and overlook this fundamental irregularity in real-world HSIs, leading to performance limitations. To tackle this issue, we propose a novel model for irregular tensor low-rank representation tailored to efficiently model irregular 3D cubes. By incorporating a non-convex nuclear norm to promote low-rankness and integrating a global negative low-rank term to enhance the discriminative ability, our proposed model is formulated as a constrained optimization problem and solved using an alternating augmented Lagrangian method. Experimental validation conducted on four public datasets demonstrates the superior performance of our method compared to existing state-of-the-art approaches. The code is publicly available at https://github.com/hb-studying/ITLRR.
DOI: 10.1109/TIP.2025.3571669 | Volume 34, pp. 3239-3252 | Published: 2025-03-26
Citations: 0
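The workhorse inside augmented-Lagrangian nuclear-norm solvers is singular value thresholding (SVT), the proximal operator of the nuclear norm; the paper's non-convex surrogate would reweight this shrinkage, but the convex version below illustrates the mechanics.

```python
import numpy as np

def svt(M, tau):
    """Proximal operator of the nuclear norm: shrink singular values by tau,
    zeroing the small ones, which drives the reconstruction toward low rank."""
    U, s, Vt = np.linalg.svd(M, full_matrices=False)
    return U @ np.diag(np.maximum(s - tau, 0.0)) @ Vt

X = np.random.rand(50, 30)
print(np.linalg.matrix_rank(svt(X, tau=3.0)))  # far below min(50, 30)
```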
SPU+: Dimension Folding for Semantic Point Cloud Upsampling
Zhuangzi Li;Thomas H. Li;Shan Liu;Ge Li
Abstract: Semantic Point Cloud Upsampling (SPU) aims to reconstruct a high-resolution (dense) 3D point cloud from a low-resolution (sparse) one, ensuring that the upsampled point cloud is easily recognizable by downstream tasks. Conventional upsampling architectures typically represent point clouds using high-dimensional feature vectors. However, we observe a dimensional bottleneck, where simply increasing the feature dimensionality does not necessarily improve performance on semantic tasks. This insight motivates us to explore more effective feature representations within upsampling networks. In this paper, we propose a novel SPU method called SPU+, which introduces dimension folding as an alternative strategy for handling high-dimensional features. Specifically, SPU+ decomposes each high-dimensional feature into several g-dimensional packages, allowing interactions among packages within the feature space. Guided by the principle of maximizing feature diversity, we determine that setting the package dimension to 3 yields optimal performance. To enable convolutional operations over these 3D packages, we present a 3D Residual Graph Convolution Block (3D-RGCB) that achieves high computational efficiency. Based on 3D-RGCBs, we design an upsampling network that incorporates three structural modes: pre-mode, middle-mode, and end-mode. Additionally, for large-scale upsampling, we develop a scaling-and-shuffling strategy that adaptively adjusts the spatial size of each 3D package. Finally, we analyze the covering number of the 3D package representation and compare it to traditional high-dimensional feature representations. Experiments on publicly available datasets demonstrate not only the effectiveness of dimension folding but also the state-of-the-art performance achieved by SPU+. Code is available at: https://github.com/lizhuangzi/SPU_plus.
DOI: 10.1109/TIP.2025.3571680 | Volume 34, pp. 3389-3402 | Published: 2025-03-26
Citations: 0
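Dimension folding itself is a reshape: a C-dimensional per-point feature is split into C/g packages of g dimensions each, with g = 3 being the reported optimum. A quick sanity-check sketch with assumed tensor sizes:

```python
import torch

B, C, N, g = 2, 96, 1024, 3                # assumed batch/feature/point sizes
feat = torch.randn(B, C, N)                # conventional high-dim point features
packages = feat.view(B, C // g, g, N)      # (2, 32, 3, 1024): 3-D packages
# 3D-RGCB-style blocks would convolve over these packages; here we only check
# that the folding is lossless.
assert torch.equal(packages.reshape(B, C, N), feat)
```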