Latest Articles in IET Computer Vision

Context-aware relation enhancement and similarity reasoning for image-text retrieval
IF 1.5 | Zone 4 | Computer Science
IET Computer Vision | Pub Date: 2024-01-30 | DOI: 10.1049/cvi2.12270
Zheng Cui, Yongli Hu, Yanfeng Sun, Baocai Yin
{"title":"Context-aware relation enhancement and similarity reasoning for image-text retrieval","authors":"Zheng Cui,&nbsp;Yongli Hu,&nbsp;Yanfeng Sun,&nbsp;Baocai Yin","doi":"10.1049/cvi2.12270","DOIUrl":"10.1049/cvi2.12270","url":null,"abstract":"<p>Image-text retrieval is a fundamental yet challenging task, which aims to bridge a semantic gap between heterogeneous data to achieve precise measurements of semantic similarity. The technique of fine-grained alignment between cross-modal features plays a key role in various successful methods that have been proposed. Nevertheless, existing methods cannot effectively utilise intra-modal information to enhance feature representation and lack powerful similarity reasoning to get a precise similarity score. Intending to tackle these issues, a context-aware Relation Enhancement and Similarity Reasoning model, called RESR, is proposed, which conducts both intra-modal relation enhancement and inter-modal similarity reasoning while considering the global-context information. For intra-modal relation enhancement, a novel context-aware graph convolutional network is introduced to enhance local feature representations by utilising relation and global-context information. For inter-modal similarity reasoning, local and global similarity features are exploited by the bidirectional alignment of image and text, and the similarity reasoning is implemented among multi-granularity similarity features. Finally, refined local and global similarity features are adaptively fused to get a precise similarity score. The experimental results show that our effective model outperforms some state-of-the-art approaches, achieving average improvements of 2.5% and 6.3% in R@sum on the Flickr30K and MS-COCO dataset.</p>","PeriodicalId":56304,"journal":{"name":"IET Computer Vision","volume":"18 5","pages":"652-665"},"PeriodicalIF":1.5,"publicationDate":"2024-01-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://onlinelibrary.wiley.com/doi/epdf/10.1049/cvi2.12270","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140483593","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
OmDet: Large-scale vision-language multi-dataset pre-training with multimodal detection network
IF 1.5 | Zone 4 | Computer Science
IET Computer Vision | Pub Date: 2024-01-24 | DOI: 10.1049/cvi2.12268
Tiancheng Zhao, Peng Liu, Kyusong Lee
{"title":"OmDet: Large-scale vision-language multi-dataset pre-training with multimodal detection network","authors":"Tiancheng Zhao,&nbsp;Peng Liu,&nbsp;Kyusong Lee","doi":"10.1049/cvi2.12268","DOIUrl":"10.1049/cvi2.12268","url":null,"abstract":"<p>The advancement of object detection (OD) in open-vocabulary and open-world scenarios is a critical challenge in computer vision. OmDet, a novel language-aware object detection architecture and an innovative training mechanism that harnesses continual learning and multi-dataset vision-language pre-training is introduced. Leveraging natural language as a universal knowledge representation, OmDet accumulates “visual vocabularies” from diverse datasets, unifying the task as a language-conditioned detection framework. The multimodal detection network (MDN) overcomes the challenges of multi-dataset joint training and generalizes to numerous training datasets without manual label taxonomy merging. The authors demonstrate superior performance of OmDet over strong baselines in object detection in the wild, open-vocabulary detection, and phrase grounding, achieving state-of-the-art results. Ablation studies reveal the impact of scaling the pre-training visual vocabulary, indicating a promising direction for further expansion to larger datasets. The effectiveness of our deep fusion approach is underscored by its ability to learn jointly from multiple datasets, enhancing performance through knowledge sharing.</p>","PeriodicalId":56304,"journal":{"name":"IET Computer Vision","volume":"18 5","pages":"626-639"},"PeriodicalIF":1.5,"publicationDate":"2024-01-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://onlinelibrary.wiley.com/doi/epdf/10.1049/cvi2.12268","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"139601188","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
SIANet: 3D object detection with structural information augment network
IF 1.5 | Zone 4 | Computer Science
IET Computer Vision | Pub Date: 2024-01-23 | DOI: 10.1049/cvi2.12272
Jing Zhou, Tengxing Lin, Zixin Gong, Xinhan Huang
{"title":"SIANet: 3D object detection with structural information augment network","authors":"Jing Zhou,&nbsp;Tengxing Lin,&nbsp;Zixin Gong,&nbsp;Xinhan Huang","doi":"10.1049/cvi2.12272","DOIUrl":"10.1049/cvi2.12272","url":null,"abstract":"<p>3D object detection technology from point clouds has been widely applied in the field of automatic driving in recent years. In practical applications, the shape point clouds of some objects are incomplete due to occlusion or far distance, which means they suffer from insufficient structural information. This greatly affects the detection performance. To address this challenge, the authors design a Structural Information Augment (SIA) Network for 3D object detection, named SIANet. Specifically, the authors design a SIA module to reconstruct the complete shapes of objects within proposals for enhancing their geometric features, which are further fused into the spatial feature of the object for box refinement to predict accurate detection boxes. Besides, the authors construct a novel Unet-liked Context-enhanced Transformer backbone network, which stacks Context-enhanced Transformer modules and an upsampling branch to capture contextual information efficiently and generate high-quality proposals for the SIA module. Extensive experiments show that the authors’ well-designed SIANet can effectively improve detection performance, especially surpassing the baseline network by 1.04% mean Average Precision (mAP) gain in the KITTI dataset and 0.75% LEVEL_2 mAP gain in the Waymo dataset.</p>","PeriodicalId":56304,"journal":{"name":"IET Computer Vision","volume":"18 5","pages":"682-695"},"PeriodicalIF":1.5,"publicationDate":"2024-01-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://onlinelibrary.wiley.com/doi/epdf/10.1049/cvi2.12272","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"139604878","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
Adversarial catoptric light: An effective, stealthy and robust physical-world attack to DNNs
IF 1.5 | Zone 4 | Computer Science
IET Computer Vision | Pub Date: 2024-01-18 | DOI: 10.1049/cvi2.12264
Chengyin Hu, Weiwen Shi, Ling Tian, Wen Li
{"title":"Adversarial catoptric light: An effective, stealthy and robust physical-world attack to DNNs","authors":"Chengyin Hu,&nbsp;Weiwen Shi,&nbsp;Ling Tian,&nbsp;Wen Li","doi":"10.1049/cvi2.12264","DOIUrl":"10.1049/cvi2.12264","url":null,"abstract":"<p>Recent studies have demonstrated that finely tuned deep neural networks (DNNs) are susceptible to adversarial attacks. Conventional physical attacks employ stickers as perturbations, achieving robust adversarial effects but compromising stealthiness. Recent innovations utilise light beams, such as lasers and projectors, for perturbation generation, allowing for stealthy physical attacks at the expense of robustness. In pursuit of implementing both stealthy and robust physical attacks, the authors present an adversarial catoptric light (AdvCL). This method leverages the natural phenomenon of catoptric light to generate perturbations that are both natural and stealthy. AdvCL first formalises the physical parameters of catoptric light and then optimises these parameters using a genetic algorithm to derive the most adversarial perturbation. Finally, the perturbations are deployed in the physical scene to execute stealthy and robust attacks. The proposed method is evaluated across three dimensions: effectiveness, stealthiness, and robustness. Quantitative results obtained in simulated environments demonstrate the efficacy of the proposed method, achieving an attack success rate of 83.5%, surpassing the baseline. The authors utilise common catoptric light as a perturbation to enhance the method's stealthiness, rendering physical samples more natural in appearance. Robustness is affirmed by successfully attacking advanced DNNs with a success rate exceeding 80% in all cases. Additionally, the authors discuss defence strategies against AdvCL and introduce some light-based physical attacks.</p>","PeriodicalId":56304,"journal":{"name":"IET Computer Vision","volume":"18 5","pages":"557-573"},"PeriodicalIF":1.5,"publicationDate":"2024-01-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://onlinelibrary.wiley.com/doi/epdf/10.1049/cvi2.12264","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"139614963","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
A novel multi-model 3D object detection framework with adaptive voxel-image feature fusion
IF 1.5 | Zone 4 | Computer Science
IET Computer Vision | Pub Date: 2024-01-17 | DOI: 10.1049/cvi2.12269
Zhao Liu, Zhongliang Fu, Gang Li, Shengyuan Zhang
{"title":"A novel multi-model 3D object detection framework with adaptive voxel-image feature fusion","authors":"Zhao Liu,&nbsp;Zhongliang Fu,&nbsp;Gang Li,&nbsp;Shengyuan Zhang","doi":"10.1049/cvi2.12269","DOIUrl":"10.1049/cvi2.12269","url":null,"abstract":"<p>The multifaceted nature of sensor data has long been a hurdle for those seeking to harness its full potential in the field of 3D object detection. Although the utilisation of point clouds as input has yielded exceptional results, the challenge of effectively combining the complementary properties of multi-sensor data looms large. This work presents a new approach to multi-model 3D object detection, called adaptive voxel-image feature fusion (AVIFF). Adaptive voxel-image feature fusion is an end-to-end single-shot framework that can dynamically and adaptively fuse point cloud and image features, resulting in a more comprehensive and integrated analysis of the camera sensor and the LiDar sensor data. With the aid of the adaptive feature fusion module, spatialised image features can be adroitly fused with voxel-based point cloud features, while the Dense Fusion module ensures the preservation of the distinctive characteristics of 3D point cloud data through the use of a heterogeneous architecture. Notably, the authors’ framework features a novel generalised intersection over union loss function that enhances the perceptibility of object localsation and rotation in 3D space. Comprehensive experimentation has validated the efficacy of the authors’ proposed modules, firmly establishing AVIFF as a novel framework in the field of 3D object detection.</p>","PeriodicalId":56304,"journal":{"name":"IET Computer Vision","volume":"18 5","pages":"640-651"},"PeriodicalIF":1.5,"publicationDate":"2024-01-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://onlinelibrary.wiley.com/doi/epdf/10.1049/cvi2.12269","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"139616930","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
Multi-Scale Feature Attention-DEtection TRansformer: Multi-Scale Feature Attention for security check object detection
IF 1.5 | Zone 4 | Computer Science
IET Computer Vision | Pub Date: 2024-01-16 | DOI: 10.1049/cvi2.12267
Haifeng Sima, Bailiang Chen, Chaosheng Tang, Yudong Zhang, Junding Sun
{"title":"Multi-Scale Feature Attention-DEtection TRansformer: Multi-Scale Feature Attention for security check object detection","authors":"Haifeng Sima,&nbsp;Bailiang Chen,&nbsp;Chaosheng Tang,&nbsp;Yudong Zhang,&nbsp;Junding Sun","doi":"10.1049/cvi2.12267","DOIUrl":"10.1049/cvi2.12267","url":null,"abstract":"<p>X-ray security checks aim to detect contraband in luggage; however, the detection accuracy is hindered by the overlapping and significant size differences of objects in X-ray images. To address these challenges, the authors introduce a novel network model named Multi-Scale Feature Attention (MSFA)-DEtection TRansformer (DETR). Firstly, the pyramid feature extraction structure is embedded into the self-attention module, referred to as the MSFA. Leveraging the MSFA module, MSFA-DETR extracts multi-scale feature information and amalgamates them into high-level semantic features. Subsequently, these features are synergised through attention mechanisms to capture correlations between global information and multi-scale features. MSFA significantly bolsters the model's robustness across different sizes, thereby enhancing detection accuracy. Simultaneously, A new initialisation method for object queries is proposed. The authors’ foreground sequence extraction (FSE) module extracts key feature sequences from feature maps, serving as prior knowledge for object queries. FSE expedites the convergence of the DETR model and elevates detection accuracy. Extensive experimentation validates that this proposed model surpasses state-of-the-art methods on the CLCXray and PIDray datasets.</p>","PeriodicalId":56304,"journal":{"name":"IET Computer Vision","volume":"18 5","pages":"613-625"},"PeriodicalIF":1.5,"publicationDate":"2024-01-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://onlinelibrary.wiley.com/doi/epdf/10.1049/cvi2.12267","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"139620312","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
Clean, performance-robust, and performance-sensitive historical information based adversarial self-distillation
IF 1.5 | Zone 4 | Computer Science
IET Computer Vision | Pub Date: 2024-01-08 | DOI: 10.1049/cvi2.12265
Shuyi Li, Hongchao Hu, Shumin Huo, Hao Liang
{"title":"Clean, performance-robust, and performance-sensitive historical information based adversarial self-distillation","authors":"Shuyi Li,&nbsp;Hongchao Hu,&nbsp;Shumin Huo,&nbsp;Hao Liang","doi":"10.1049/cvi2.12265","DOIUrl":"10.1049/cvi2.12265","url":null,"abstract":"<p>Adversarial training suffers from poor effectiveness due to the challenging optimisation of loss with hard labels. To address this issue, adversarial distillation has emerged as a potential solution, encouraging target models to mimic the output of the teachers. However, reliance on pre-training teachers leads to additional training costs and raises concerns about the reliability of their knowledge. Furthermore, existing methods fail to consider the significant differences in unconfident samples between early and late stages, potentially resulting in robust overfitting. An adversarial defence method named Clean, Performance-robust, and Performance-sensitive Historical Information based Adversarial Self-Distillation (CPr &amp; PsHI-ASD) is presented. Firstly, an adversarial self-distillation replacement method based on clean, performance-robust, and performance-sensitive historical information is developed to eliminate pre-training costs and enhance guidance reliability for the target model. Secondly, adversarial self-distillation algorithms that leverage knowledge distilled from the previous iteration are introduced to facilitate the self-distillation of adversarial knowledge and mitigate the problem of robust overfitting. Experiments are conducted to evaluate the performance of the proposed method on CIFAR-10, CIFAR-100, and Tiny-ImageNet datasets. The results demonstrate that the CPr&amp;PsHI-ASD method is more effective than existing adversarial distillation methods in enhancing adversarial robustness and mitigating robust overfitting issues against various adversarial attacks.</p>","PeriodicalId":56304,"journal":{"name":"IET Computer Vision","volume":"18 5","pages":"591-612"},"PeriodicalIF":1.5,"publicationDate":"2024-01-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://onlinelibrary.wiley.com/doi/epdf/10.1049/cvi2.12265","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"139446540","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
A deep learning framework for multi-object tracking in team sports videos
IF 1.5 | Zone 4 | Computer Science
IET Computer Vision | Pub Date: 2024-01-02 | DOI: 10.1049/cvi2.12266
Wei Cao, Xiaoyong Wang, Xianxiang Liu, Yishuai Xu
{"title":"A deep learning framework for multi-object tracking in team sports videos","authors":"Wei Cao,&nbsp;Xiaoyong Wang,&nbsp;Xianxiang Liu,&nbsp;Yishuai Xu","doi":"10.1049/cvi2.12266","DOIUrl":"10.1049/cvi2.12266","url":null,"abstract":"<p>In response to the challenges of Multi-Object Tracking (MOT) in sports scenes, such as severe occlusions, similar appearances, drastic pose changes, and complex motion patterns, a deep-learning framework CTGMOT (CNN-Transformer-GNN-based MOT) specifically for multiple athlete tracking in sports videos that performs joint modelling of detection, appearance and motion features is proposed. Firstly, a detection network that combines Convolutional Neural Networks (CNN) and Transformers is constructed to extract both local and global features from images. The fusion of appearance and motion features is achieved through a design of parallel dual-branch decoders. Secondly, graph models are built using Graph Neural Networks (GNN) to accurately capture the spatio-temporal correlations between object and trajectory features from inter-frame and intra-frame associations. Experimental results on the public sports tracking dataset SportsMOT show that the proposed framework outperforms other state-of-the-art methods for MOT in complex sport scenes. In addition, the proposed framework shows excellent generality on benchmark datasets MOT17 and MOT20.</p>","PeriodicalId":56304,"journal":{"name":"IET Computer Vision","volume":"18 5","pages":"574-590"},"PeriodicalIF":1.5,"publicationDate":"2024-01-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://onlinelibrary.wiley.com/doi/epdf/10.1049/cvi2.12266","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"139453061","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
Spatial feature embedding for robust visual object tracking
IF 1.7 | Zone 4 | Computer Science
IET Computer Vision | Pub Date: 2023-12-20 | DOI: 10.1049/cvi2.12263
Kang Liu, Long Liu, Shangqi Yang, Zhihao Fu
{"title":"Spatial feature embedding for robust visual object tracking","authors":"Kang Liu,&nbsp;Long Liu,&nbsp;Shangqi Yang,&nbsp;Zhihao Fu","doi":"10.1049/cvi2.12263","DOIUrl":"10.1049/cvi2.12263","url":null,"abstract":"<p>Recently, the offline-trained Siamese pipeline has drawn wide attention due to its outstanding tracking performance. However, the existing Siamese trackers utilise offline training to extract ‘universal’ features, which is insufficient to effectively distinguish between the target and fluctuating interference in embedding the information of the two branches, leading to inaccurate classification and localisation. In addition, the Siamese trackers employ a pre-defined scale for cropping the search candidate region based on the previous frame's result, which might easily introduce redundant background noise (clutter, similar objects etc.), affecting the tracker's robustness. To solve these problems, the authors propose two novel sub-network spatial employed to spatial feature embedding for robust object tracking. Specifically, the proposed spatial remapping (SRM) network enhances the feature discrepancy between target and distractor categories by online remapping, and improves the discriminant ability of the tracker on the embedding space. The MAML is used to optimise the SRM network to ensure its adaptability to complex tracking scenarios. Moreover, a temporal information proposal-guided (TPG) network that utilises a GRU model to dynamically predict the search scale based on temporal motion states to reduce potential background interference is introduced. The proposed two network is integrated into two popular trackers, namely SiamFC++ and TransT, which achieve superior performance on six challenging benchmarks, including OTB100, VOT2019, UAV123, GOT10K, TrackingNet and LaSOT, TrackingNet and LaSOT denoting them as SiamSRMC and SiamSRMT, respectively. Moreover, the proposed trackers obtain competitive tracking performance compared with the state-of-the-art trackers in the attribute of background clutter and similar object, validating the effectiveness of our method.</p>","PeriodicalId":56304,"journal":{"name":"IET Computer Vision","volume":"18 4","pages":"540-556"},"PeriodicalIF":1.7,"publicationDate":"2023-12-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://onlinelibrary.wiley.com/doi/epdf/10.1049/cvi2.12263","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"138954945","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
Unsupervised image blind super resolution via real degradation feature learning
IF 1.7 | Zone 4 | Computer Science
IET Computer Vision | Pub Date: 2023-12-15 | DOI: 10.1049/cvi2.12262
Cheng Yang, Guanming Lu
{"title":"Unsupervised image blind super resolution via real degradation feature learning","authors":"Cheng Yang,&nbsp;Guanming Lu","doi":"10.1049/cvi2.12262","DOIUrl":"10.1049/cvi2.12262","url":null,"abstract":"<p>In recent years, many methods for image super-resolution (SR) have relied on pairs of low-resolution (LR) and high-resolution (HR) images for training, where the degradation process is predefined by bicubic downsampling. While such approaches perform well in standard benchmark tests, they often fail to accurately replicate the complexity of real-world image degradation. To address this challenge, researchers have proposed the use of unpaired image training to implicitly model the degradation process. However, there is a significant domain gap between the real-world LR and the synthetic LR images from HR, which severely degrades the SR performance. A novel unsupervised image-blind super-resolution method that exploits degradation feature-based learning for real-image super-resolution reconstruction (RDFL) is proposed. Their approach learns the degradation process from HR to LR using a generative adversarial network (GAN) and constrains the data distribution of the synthetic LR with real degraded images. The authors then encode the degraded features into a Transformer-based SR network for image super-resolution reconstruction through degradation representation learning. Extensive experiments on both synthetic and real datasets demonstrate the effectiveness and superiority of the RDFL method, which achieves visually pleasing reconstruction results.</p>","PeriodicalId":56304,"journal":{"name":"IET Computer Vision","volume":"18 4","pages":"485-498"},"PeriodicalIF":1.7,"publicationDate":"2023-12-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://onlinelibrary.wiley.com/doi/epdf/10.1049/cvi2.12262","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"139001043","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0