Signal Processing-Image Communication: Latest Publications

Globally and locally optimized Pannini projection for high FoV rendering of 360° images
IF 3.4 · CAS Zone 3 · Engineering & Technology
Signal Processing-Image Communication · Pub Date: 2024-08-30 · DOI: 10.1016/j.image.2024.117190
Falah Jabar, João Ascenso, Maria Paula Queluz
{"title":"Globally and locally optimized Pannini projection for high FoV rendering of 360° images","authors":"Falah Jabar,&nbsp;João Ascenso,&nbsp;Maria Paula Queluz","doi":"10.1016/j.image.2024.117190","DOIUrl":"10.1016/j.image.2024.117190","url":null,"abstract":"<div><p>To render a spherical (360° or omnidirectional) image on planar displays, a 2D image - called as viewport - must be obtained by projecting a sphere region on a plane, according to the user's viewing direction and a predefined field of view (FoV). However, any sphere to plan projection introduces geometric distortions, such as object stretching and/or bending of straight lines, which intensity increases with the considered FoV. In this paper, a fully automatic content-aware projection is proposed, aiming to reduce the geometric distortions when high FoVs are used. This new projection is based on the Pannini projection, whose parameters are firstly globally optimized according to the image content, followed by a local conformality improvement of relevant viewport objects. A crowdsourcing subjective test showed that the proposed projection is the most preferred solution among the considered state-of-the-art sphere to plan projections, producing viewports with a more pleasant visual quality.</p></div>","PeriodicalId":49521,"journal":{"name":"Signal Processing-Image Communication","volume":"129 ","pages":"Article 117190"},"PeriodicalIF":3.4,"publicationDate":"2024-08-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.sciencedirect.com/science/article/pii/S0923596524000912/pdfft?md5=1ff2da4c676f5e3a19cdbe6c4c5f6989&pid=1-s2.0-S0923596524000912-main.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142136394","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"工程技术","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
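The base mapping this work builds on has a compact closed form. The sketch below implements the single-parameter Pannini projection (as described by Sharpless et al., 2010); the paper's content-aware global/local parameter optimization is not reproduced, and the function name is illustrative.

```python
import numpy as np

def pannini_project(lon, lat, d=1.0):
    """Map a viewing direction (longitude, latitude in radians) to the
    Pannini image plane. d = 0 reduces to the rectilinear projection;
    d = 1 is the classic Pannini; larger d approaches a cylindrical look."""
    S = (d + 1.0) / (d + np.cos(lon))  # radial compression factor
    x = S * np.sin(lon)                # horizontal coordinate
    y = S * np.tan(lat)                # vertical coordinate (keeps verticals straight)
    return x, y

# Example: a point 50 degrees off-centre in a wide-FoV viewport.
x, y = pannini_project(np.radians(50.0), np.radians(15.0), d=1.0)
```

Increasing d trades straight-line preservation for less stretching at wide FoVs, which is exactly the trade-off the paper optimizes per image.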
Prototype-wise self-knowledge distillation for few-shot segmentation
IF 3.4 · CAS Zone 3 · Engineering & Technology
Signal Processing-Image Communication · Pub Date: 2024-08-21 · DOI: 10.1016/j.image.2024.117186
Yadang Chen, Xinyu Xu, Chenchen Wei, Chuhan Lu
{"title":"Prototype-wise self-knowledge distillation for few-shot segmentation","authors":"Yadang Chen ,&nbsp;Xinyu Xu ,&nbsp;Chenchen Wei ,&nbsp;Chuhan Lu","doi":"10.1016/j.image.2024.117186","DOIUrl":"10.1016/j.image.2024.117186","url":null,"abstract":"<div><p>Few-shot segmentation was proposed to obtain segmentation results for a image with an unseen class by referring to a few labeled samples. However, due to the limited number of samples, many few-shot segmentation models suffer from poor generalization. Prototypical network-based few-shot segmentation still has issues with spatial inconsistency and prototype bias. Since the target class has different appearance in each image, some specific features in the prototypes generated from the support image and its mask do not accurately reflect the generalized features of the target class. To address the support prototype consistency issue, we put forward two modules: Data Augmentation Self-knowledge Distillation (DASKD) and Prototype-wise Regularization (PWR). The DASKD module focuses on enhancing spatial consistency by using data augmentation and self-knowledge distillation. Self-knowledge distillation helps the model acquire generalized features of the target class and learn hidden knowledge from the support images. The PWR module focuses on obtaining a more representative support prototype by conducting prototype-level loss to obtain support prototypes closer to the category center. Broad evaluation experiments on PASCAL-<span><math><msup><mrow><mn>5</mn></mrow><mrow><mi>i</mi></mrow></msup></math></span> and COCO-<span><math><mrow><mn>2</mn><msup><mrow><mn>0</mn></mrow><mrow><mi>i</mi></mrow></msup></mrow></math></span> demonstrate that our model outperforms the prior works on few-shot segmentation. Our approach surpasses the state of the art by 7.5% in PASCAL-<span><math><msup><mrow><mn>5</mn></mrow><mrow><mi>i</mi></mrow></msup></math></span> and 4.2% in COCO-<span><math><mrow><mn>2</mn><msup><mrow><mn>0</mn></mrow><mrow><mi>i</mi></mrow></msup></mrow></math></span>.</p></div>","PeriodicalId":49521,"journal":{"name":"Signal Processing-Image Communication","volume":"129 ","pages":"Article 117186"},"PeriodicalIF":3.4,"publicationDate":"2024-08-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142077049","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"工程技术","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
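For readers unfamiliar with the prototype pipeline this abstract assumes, here is a minimal PyTorch sketch of the standard prototypical baseline: masked average pooling over support features, then cosine-similarity matching on the query. The paper's DASKD and PWR modules build on top of this and are not reproduced.

```python
import torch
import torch.nn.functional as F

def masked_average_pooling(feat, mask):
    """feat: (B, C, H, W) support features; mask: (B, 1, h, w) binary support mask.
    Returns one prototype vector per support image, shape (B, C)."""
    mask = F.interpolate(mask, size=feat.shape[-2:], mode="bilinear", align_corners=False)
    return (feat * mask).sum(dim=(2, 3)) / (mask.sum(dim=(2, 3)) + 1e-6)

def cosine_prediction(query_feat, proto, scale=20.0):
    """query_feat: (B, C, H, W); proto: (B, C). Returns a (B, H, W) score map."""
    return scale * F.cosine_similarity(query_feat, proto[:, :, None, None], dim=1)

# Toy shapes: one support/query pair with 64-channel backbone features.
support = torch.randn(1, 64, 32, 32)
mask = torch.ones(1, 1, 128, 128)
scores = cosine_prediction(torch.randn(1, 64, 32, 32),
                           masked_average_pooling(support, mask))
```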
Transformer-CNN for small image object detection
IF 3.4 · CAS Zone 3 · Engineering & Technology
Signal Processing-Image Communication · Pub Date: 2024-08-21 · DOI: 10.1016/j.image.2024.117194
Yan-Lin Chen, Chun-Liang Lin, Yu-Chen Lin, Tzu-Chun Chen
{"title":"Transformer-CNN for small image object detection","authors":"Yan-Lin Chen ,&nbsp;Chun-Liang Lin ,&nbsp;Yu-Chen Lin ,&nbsp;Tzu-Chun Chen","doi":"10.1016/j.image.2024.117194","DOIUrl":"10.1016/j.image.2024.117194","url":null,"abstract":"<div><p>Object recognition in computer vision technology has been a popular research field in recent years. Although the detection success rate of regular objects has achieved impressive results, small object detection (SOD) is still a challenging issue. In the Microsoft Common Objects in Context (MS COCO) public dataset, the detection rate of small objects is typically half that of regular-sized objects. The main reason is that small objects are often affected by multi-layer convolution and pooling, leading to insufficient details to distinguish them from the background or similar objects, resulting in poor recognition rates or even no results. This paper presents a network architecture, Transformer-CNN, that combines a self-attention mechanism-based transformer and a convolutional neural network (CNN) to improve the recognition rate of SOD. It captures global information through a transformer and uses the translation invariance and translation equivalence of CNN to maximize the retention of global and local features while improving the reliability and robustness of SOD. Our experiments show that the proposed model improves the small object recognition rate by 2∼5 % than the general transformer architectures.</p></div>","PeriodicalId":49521,"journal":{"name":"Signal Processing-Image Communication","volume":"129 ","pages":"Article 117194"},"PeriodicalIF":3.4,"publicationDate":"2024-08-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142044684","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"工程技术","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
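The general hybrid pattern described here (a convolutional stem preserving local detail, feeding a transformer encoder that adds global context) can be sketched in a few lines. This is an illustrative PyTorch skeleton, not the paper's architecture; positional encodings and the detection head are omitted.

```python
import torch
import torch.nn as nn

class ConvTransformerBackbone(nn.Module):
    """CNN stem for local features, transformer encoder for global context."""
    def __init__(self, in_ch=3, dim=128, nhead=4, depth=2):
        super().__init__()
        self.stem = nn.Sequential(
            nn.Conv2d(in_ch, dim, 3, stride=2, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(dim, dim, 3, stride=2, padding=1), nn.ReLU(inplace=True),
        )
        layer = nn.TransformerEncoderLayer(d_model=dim, nhead=nhead, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=depth)

    def forward(self, x):
        f = self.stem(x)                       # (B, C, H/4, W/4): local CNN features
        B, C, H, W = f.shape
        tokens = f.flatten(2).transpose(1, 2)  # (B, H*W, C): one token per location
        tokens = self.encoder(tokens)          # global self-attention across tokens
        return tokens.transpose(1, 2).reshape(B, C, H, W)

out = ConvTransformerBackbone()(torch.randn(1, 3, 256, 256))  # -> (1, 128, 64, 64)
```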
Feature extractor optimization for discriminative representations in Generalized Category Discovery
IF 3.4 · CAS Zone 3 · Engineering & Technology
Signal Processing-Image Communication · Pub Date: 2024-08-17 · DOI: 10.1016/j.image.2024.117195
Zhonghao Chang, Xiao Li, Zihao Zhao
{"title":"Feature extractor optimization for discriminative representations in Generalized Category Discovery","authors":"Zhonghao Chang,&nbsp;Xiao Li,&nbsp;Zihao Zhao","doi":"10.1016/j.image.2024.117195","DOIUrl":"10.1016/j.image.2024.117195","url":null,"abstract":"<div><p>Generalized Category Discovery (GCD) task involves transferring knowledge from labeled known categories to recognize both known and novel categories within an unlabeled dataset. A significant challenge arises from the lack of prior information for novel categories. To address this, we develop a feature extractor that can learn discriminative features for both known and novel categories. Our approach leverages the observation that similar samples often belong to the same class. We construct a similarity matrix and employ similarity contrastive loss to increase the similarity between similar samples in the feature space. Additionally, we incorporate cluster labels to further refine the feature extractor, utilizing K-means clustering to assign these labels to unlabeled data, providing valuable supervision. Our feature extractor is optimized through the utilization of instance-level contrastive learning and class-level contrastive learning constraints. These constraints promote similarity maximization in both the instance space and the label space for instances sharing the same pseudo-labels. These three components complement each other, facilitating the learning of discriminative representations for both known and novel categories. Through comprehensive evaluations of generic image recognition datasets and challenging fine-grained datasets, we demonstrate that our proposed method achieves state-of-the-art performance.</p></div>","PeriodicalId":49521,"journal":{"name":"Signal Processing-Image Communication","volume":"129 ","pages":"Article 117195"},"PeriodicalIF":3.4,"publicationDate":"2024-08-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142020716","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"工程技术","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
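The class-level constraint described above can be illustrated with a supervised-contrastive loss driven by K-means pseudo-labels: samples sharing a pseudo-label are pulled together, all others pushed apart. The sketch below is one plausible instantiation, not the paper's exact loss.

```python
import torch
import torch.nn.functional as F

def pseudo_label_contrastive_loss(z, pseudo_labels, tau=0.07):
    """z: (N, D) embeddings; pseudo_labels: (N,) integer cluster assignments
    (e.g., from K-means on unlabeled data)."""
    z = F.normalize(z, dim=1)
    sim = z @ z.t() / tau                                   # pairwise similarities
    eye = torch.eye(len(z), dtype=torch.bool, device=z.device)
    sim = sim.masked_fill(eye, float("-inf"))               # drop self-pairs
    pos = (pseudo_labels[:, None] == pseudo_labels[None, :]) & ~eye
    log_prob = sim - torch.logsumexp(sim, dim=1, keepdim=True)
    return -(log_prob * pos).sum(1).div(pos.sum(1).clamp(min=1)).mean()

# Toy usage: 8 embeddings clustered into 3 pseudo-classes.
loss = pseudo_label_contrastive_loss(torch.randn(8, 16),
                                     torch.tensor([0, 0, 1, 1, 2, 2, 0, 1]))
```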
Image-based virtual try-on: Fidelity and simplification
IF 3.4 · CAS Zone 3 · Engineering & Technology
Signal Processing-Image Communication · Pub Date: 2024-08-16 · DOI: 10.1016/j.image.2024.117189
Tasin Islam, Alina Miron, Xiaohui Liu, Yongmin Li
{"title":"Image-based virtual try-on: Fidelity and simplification","authors":"Tasin Islam,&nbsp;Alina Miron,&nbsp;Xiaohui Liu,&nbsp;Yongmin Li","doi":"10.1016/j.image.2024.117189","DOIUrl":"10.1016/j.image.2024.117189","url":null,"abstract":"<div><p>We introduce a novel image-based virtual try-on model designed to replace a candidate’s garment with a desired target item. The proposed model comprises three modules: segmentation, garment warping, and candidate-clothing fusion. Previous methods have shown limitations in cases involving significant differences between the original and target clothing, as well as substantial overlapping of body parts. Our model addresses these limitations by employing two key strategies. Firstly, it utilises a candidate representation based on an RGB skeleton image to enhance spatial relationships among body parts, resulting in robust segmentation and improved occlusion handling. Secondly, truncated U-Net is employed in both the segmentation and warping modules, enhancing segmentation performance and accelerating the try-on process. The warping module leverages an efficient affine transform for ease of training. Comparative evaluations against state-of-the-art models demonstrate the competitive performance of our proposed model across various scenarios, particularly excelling in handling occlusion cases and significant differences in clothing cases. This research presents a promising solution for image-based virtual try-on, advancing the field by overcoming key limitations and achieving superior performance.</p></div>","PeriodicalId":49521,"journal":{"name":"Signal Processing-Image Communication","volume":"129 ","pages":"Article 117189"},"PeriodicalIF":3.4,"publicationDate":"2024-08-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.sciencedirect.com/science/article/pii/S0923596524000900/pdfft?md5=d7b74bcca8966cd1d3e0e38fa30c8482&pid=1-s2.0-S0923596524000900-main.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142049928","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"工程技术","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
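The affine garment warp mentioned in the abstract is differentiable and cheap to apply. A minimal sketch using PyTorch's affine_grid/grid_sample follows; in a try-on pipeline the 2×3 parameters would be regressed by the warping module, whereas here they are hand-set for illustration.

```python
import torch
import torch.nn.functional as F

def affine_warp(garment, theta):
    """garment: (B, C, H, W) garment image; theta: (B, 2, 3) affine parameters."""
    grid = F.affine_grid(theta, list(garment.shape), align_corners=False)
    return F.grid_sample(garment, grid, align_corners=False)

# Hand-set parameters for illustration: ~90% scale plus a small translation.
theta = torch.tensor([[[0.9, 0.0, 0.1],
                       [0.0, 0.9, 0.0]]])
warped = affine_warp(torch.randn(1, 3, 256, 192), theta)  # (1, 3, 256, 192)
```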
Duration-aware and mode-aware micro-expression spotting for long video sequences
IF 3.4 · CAS Zone 3 · Engineering & Technology
Signal Processing-Image Communication · Pub Date: 2024-08-10 · DOI: 10.1016/j.image.2024.117192
Jing Liu, Xin Li, Jiaqi Zhang, Guangtao Zhai, Yuting Su, Yuyi Zhang, Bo Wang
{"title":"Duration-aware and mode-aware micro-expression spotting for long video sequences","authors":"Jing Liu ,&nbsp;Xin Li ,&nbsp;Jiaqi Zhang ,&nbsp;Guangtao Zhai ,&nbsp;Yuting Su ,&nbsp;Yuyi Zhang ,&nbsp;Bo Wang","doi":"10.1016/j.image.2024.117192","DOIUrl":"10.1016/j.image.2024.117192","url":null,"abstract":"<div><p>Micro-expressions (MEs) are unconscious, instant and slight facial movements, revealing people’s true emotions. Locating MEs is a prerequisite of classifying them, while only a few researches focus on this task. Among them, sliding window based methods are the most prevalent. Due to the differences of individual physiological and psychological mechanisms, and some uncontrollable factors, the durations and transition modes of different MEs fluctuate greatly. Limited to fixed window scale and mode, traditional sliding window based ME spotting methods fail to capture the motion changes of all MEs exactly, resulting in performance degradation. In this paper, an ensemble learning based duration &amp; mode-aware (DMA) ME spotting framework is proposed. Specifically, we exploit multiple sliding windows of different scales and modes to generate multiple weak detectors, each of which accommodates to MEs with certain duration and transition mode. Additionally, to get a more comprehensive strong detector, we integrate the analysis results of multiple weak detectors using a voting based aggregation module. Furthermore, a novel interval generation scheme is designed to merge close peaks and their neighbor frames into a complete ME interval. Experimental results on two long video databases show the promising performance of our proposed DMA framework compared with state-of-the-art methods. The codes are available at <span><span>https://github.com/TJUMMG/DMA-ME-Spotting</span><svg><path></path></svg></span>.</p></div>","PeriodicalId":49521,"journal":{"name":"Signal Processing-Image Communication","volume":"129 ","pages":"Article 117192"},"PeriodicalIF":3.4,"publicationDate":"2024-08-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142049929","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"工程技术","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
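The ensemble-and-vote idea can be made concrete with a toy spotting routine: several window scales each act as a weak peak detector, votes are aggregated per frame, and nearby surviving peaks are merged into intervals. This NumPy sketch uses an invented threshold and a deliberately simple peak rule, so it illustrates the control flow rather than the paper's detectors.

```python
import numpy as np

def spot_intervals(score, scales=(9, 15, 21), vote_thresh=2, merge_gap=3):
    """score: (T,) per-frame facial-motion score. Each window scale votes for
    frames where it sees a centred local peak; frames with enough votes are
    merged into candidate ME intervals."""
    T = len(score)
    votes = np.zeros(T, dtype=int)
    bar = score.mean() + score.std()          # crude global peak threshold
    for w in scales:
        for t in range(T - w + 1):
            seg = score[t:t + w]
            if seg.argmax() == w // 2 and seg.max() > bar:
                votes[t + w // 2] += 1        # this scale votes for the centre frame
    peaks = np.flatnonzero(votes >= vote_thresh)
    intervals = []
    for p in peaks:                           # merge peaks closer than merge_gap
        if intervals and p - intervals[-1][1] <= merge_gap:
            intervals[-1][1] = p
        else:
            intervals.append([p, p])
    return [tuple(iv) for iv in intervals]

intervals = spot_intervals(np.random.rand(300))
```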
Low-rank tensor completion based on tensor train rank with partially overlapped sub-blocks and total variation
IF 3.4 · CAS Zone 3 · Engineering & Technology
Signal Processing-Image Communication · Pub Date: 2024-08-10 · DOI: 10.1016/j.image.2024.117193
Jingfei He, Zezhong Yang, Xunan Zheng, Xiaoyue Zhang, Ao Li
{"title":"Low-rank tensor completion based on tensor train rank with partially overlapped sub-blocks and total variation","authors":"Jingfei He,&nbsp;Zezhong Yang,&nbsp;Xunan Zheng,&nbsp;Xiaoyue Zhang,&nbsp;Ao Li","doi":"10.1016/j.image.2024.117193","DOIUrl":"10.1016/j.image.2024.117193","url":null,"abstract":"<div><p>Recently, the low-rank tensor completion method based on tensor train (TT) rank has achieved promising performance. Ket augmentation (KA) is commonly used in TT rank-based methods to improve the performance by converting low-dimensional tensors to higher-dimensional tensors. However, block artifacts are caused since KA also destroys the original structure and image continuity of original low-dimensional tensors. To tackle this issue, a low-rank tensor completion method based on TT rank with tensor augmentation by partially overlapped sub-blocks (TAPOS) and total variation (TV) is proposed in this paper. The proposed TAPOS preserves the image continuity of the original tensor and enhances the low-rankness of the generated higher-dimensional tensors, and a weighted de-augmentation method is used to assign different weights to the elements of sub-blocks and further reduce the block artifacts. To further alleviate the block artifacts and improve reconstruction accuracy, TV is introduced in the TAPOS-based model to add the piecewise smooth prior. The parallel matrix decomposition method is introduced to estimate the TT rank to reduce the computational cost. Numerical experiments show that the proposed method outperforms the existing state-of-the-art methods.</p></div>","PeriodicalId":49521,"journal":{"name":"Signal Processing-Image Communication","volume":"129 ","pages":"Article 117193"},"PeriodicalIF":3.4,"publicationDate":"2024-08-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142049912","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"工程技术","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
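The overlapped-block augmentation and its weighted inverse are easy to picture in two dimensions. Below is a minimal NumPy sketch of splitting an image into partially overlapped sub-blocks and merging them back by averaging overlaps (uniform weights for simplicity; the paper's non-uniform weighting and the TT-rank completion step are not reproduced).

```python
import numpy as np

def extract_blocks(img, block=32, overlap=8):
    """Split a 2-D array into partially overlapped sub-blocks."""
    step = block - overlap
    H, W = img.shape
    return [(i, j, img[i:i + block, j:j + block])
            for i in range(0, H - block + 1, step)
            for j in range(0, W - block + 1, step)]

def merge_blocks(blocks, shape):
    """De-augmentation: accumulate block contributions and divide by the
    per-pixel overlap count (i.e., uniform averaging over overlaps)."""
    acc = np.zeros(shape)
    cnt = np.zeros(shape)
    for i, j, b in blocks:
        acc[i:i + b.shape[0], j:j + b.shape[1]] += b
        cnt[i:i + b.shape[0], j:j + b.shape[1]] += 1.0
    return acc / np.maximum(cnt, 1e-9)

img = np.random.rand(128, 128)
rec = merge_blocks(extract_blocks(img), img.shape)  # reproduces img exactly here
assert np.allclose(rec, img)
```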
HDR-ChipQA: No-reference quality assessment on High Dynamic Range videos
IF 3.4 · CAS Zone 3 · Engineering & Technology
Signal Processing-Image Communication · Pub Date: 2024-08-10 · DOI: 10.1016/j.image.2024.117191
Joshua P. Ebenezer, Zaixi Shang, Yongjun Wu, Hai Wei, Sriram Sethuraman, Alan C. Bovik
{"title":"HDR-ChipQA: No-reference quality assessment on High Dynamic Range videos","authors":"Joshua P. Ebenezer ,&nbsp;Zaixi Shang ,&nbsp;Yongjun Wu ,&nbsp;Hai Wei ,&nbsp;Sriram Sethuraman ,&nbsp;Alan C. Bovik","doi":"10.1016/j.image.2024.117191","DOIUrl":"10.1016/j.image.2024.117191","url":null,"abstract":"<div><p>We present a no-reference video quality model and algorithm that delivers standout performance for High Dynamic Range (HDR) videos, which we call HDR-ChipQA. HDR videos represent wider ranges of luminances, details, and colors than Standard Dynamic Range (SDR) videos. The growing adoption of HDR in massively scaled video networks has driven the need for video quality assessment (VQA) algorithms that better account for distortions on HDR content. In particular, standard VQA models may fail to capture conspicuous distortions at the extreme ends of the dynamic range, because the features that drive them may be dominated by distortions that pervade the mid-ranges of the signal. We introduce a new approach whereby a local expansive nonlinearity emphasizes distortions occurring at the higher and lower ends of the local luma range, allowing for the definition of additional quality-aware features that are computed along a separate path. These features are not HDR-specific, and also improve VQA on SDR video contents, albeit to a reduced degree. We show that this preprocessing step significantly boosts the power of distortion-sensitive natural video statistics (NVS) features when used to predict the quality of HDR content. In similar manner, we separately compute novel wide-gamut color features using the same nonlinear processing steps. We have found that our model significantly outperforms SDR VQA algorithms on the only publicly available, comprehensive HDR database, while also attaining state-of-the-art performance on SDR content.</p></div>","PeriodicalId":49521,"journal":{"name":"Signal Processing-Image Communication","volume":"129 ","pages":"Article 117191"},"PeriodicalIF":3.4,"publicationDate":"2024-08-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142049930","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"工程技术","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
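To make the idea of a "local expansive nonlinearity" concrete: rescale luma to [-1, 1] within each local window, then apply a function that grows fastest near ±1, so the extremes of the local range dominate the downstream statistics. The sketch below is one plausible form chosen purely for illustration; the paper's exact nonlinearity and normalization may differ.

```python
import numpy as np
from scipy.ndimage import maximum_filter, minimum_filter

def local_expansive_nonlinearity(luma, win=17, a=4.0):
    """Rescale luma to [-1, 1] within each win x win neighbourhood, then apply
    f(x) = sign(x) * expm1(a*|x|) / expm1(a), which is steepest near |x| = 1,
    emphasizing the darkest and brightest parts of the local range."""
    lo = minimum_filter(luma, size=win)
    hi = maximum_filter(luma, size=win)
    x = 2.0 * (luma - lo) / np.maximum(hi - lo, 1e-6) - 1.0  # local [-1, 1]
    return np.sign(x) * np.expm1(a * np.abs(x)) / np.expm1(a)

emphasized = local_expansive_nonlinearity(np.random.rand(64, 64))
```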
A virtual-reality spatial matching algorithm and its application on equipment maintenance support: System design and user study
IF 3.4 · CAS Zone 3 · Engineering & Technology
Signal Processing-Image Communication · Pub Date: 2024-08-10 · DOI: 10.1016/j.image.2024.117188
Xiao Yang, Fanghao Huang, Jiacheng Jiang, Zheng Chen
{"title":"A virtual-reality spatial matching algorithm and its application on equipment maintenance support: System design and user study","authors":"Xiao Yang ,&nbsp;Fanghao Huang ,&nbsp;Jiacheng Jiang ,&nbsp;Zheng Chen","doi":"10.1016/j.image.2024.117188","DOIUrl":"10.1016/j.image.2024.117188","url":null,"abstract":"<div><p>Equipment maintenance support is an important technical measure to maintain the equipment’s expected performance. However, the current maintenance supports are mainly completed by maintainers under the guidance of technical manual or additional experts, which may be insufficient for some advanced equipment with rapid update rate and complex inner structure. The rising technology of augmented reality (AR) provides a new solution for equipment maintenance support, while one of the key issues limiting the practical application of AR in maintenance field is the spatial matching issue between virtual space and reality space. In this paper, a virtual-reality spatial matching algorithm is designed to accurately superimpose the virtual information to the corresponding actual scene on the AR glasses. In this algorithm, two methods are proposed to help achieve the stable matching of virtual space and reality space. In detail, to obtain the saliency map with less background interference and improved saliency detection accuracy, a saliency detection method is designed based on the super-pixel segmentation. To deal with the problems of uneven distribution on the feature points and weak robustness to the light changes, a feature extraction and matching method is proposed for acquiring the feature point matching set with the utilization of the obtained saliency map. Finally, an immersive equipment maintenance support system (IEMSS) is developed based on this spatial matching algorithm, which provides the maintainers with immediate and immersive guidance to improve the efficiency and safety in the maintenance task, as well as offers maintenance training for inexperienced maintainers with expanded virtual information in case of limited experts. Several comparative experiments are implemented to verify the effectiveness of proposed methods, and a user study of real system application is carried out to further evaluate the superiority of these methods when applied in the IEMSS.</p></div>","PeriodicalId":49521,"journal":{"name":"Signal Processing-Image Communication","volume":"129 ","pages":"Article 117188"},"PeriodicalIF":3.4,"publicationDate":"2024-08-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141998380","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"工程技术","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
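The interplay of the two methods (saliency detection guiding feature matching) can be sketched with off-the-shelf components: restrict keypoint detection to a saliency mask, then match descriptors with a ratio test. This OpenCV sketch substitutes ORB for the paper's own feature extraction and assumes saliency maps from any detector, so it shows the pattern rather than the proposed algorithm.

```python
import cv2
import numpy as np

def match_in_salient_regions(img1, img2, sal1, sal2, thresh=128):
    """img1/img2: grayscale uint8 images; sal1/sal2: 8-bit saliency maps.
    Detect ORB keypoints only inside salient regions, then ratio-test match."""
    orb = cv2.ORB_create(nfeatures=1000)
    m1 = (sal1 > thresh).astype(np.uint8) * 255   # binary detection masks
    m2 = (sal2 > thresh).astype(np.uint8) * 255
    kp1, des1 = orb.detectAndCompute(img1, m1)
    kp2, des2 = orb.detectAndCompute(img2, m2)
    if des1 is None or des2 is None:
        return []
    matches = cv2.BFMatcher(cv2.NORM_HAMMING).knnMatch(des1, des2, k=2)
    return [pair[0] for pair in matches
            if len(pair) == 2 and pair[0].distance < 0.75 * pair[1].distance]
```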
A ‘deep’ review of video super-resolution
IF 3.4 · CAS Zone 3 · Engineering & Technology
Signal Processing-Image Communication · Pub Date: 2024-07-27 · DOI: 10.1016/j.image.2024.117175
Subhadra Gopalakrishnan, Anustup Choudhury
{"title":"A ‘deep’ review of video super-resolution","authors":"Subhadra Gopalakrishnan,&nbsp;Anustup Choudhury","doi":"10.1016/j.image.2024.117175","DOIUrl":"10.1016/j.image.2024.117175","url":null,"abstract":"<div><p>Video super-resolution (VSR) is an ill-posed inverse problem where the goal is to obtain high-resolution video content from a low-resolution counterpart. In this survey, we trace the history of video super-resolution techniques beginning with traditional methods, showing the evolution towards techniques that use shallow networks and finally, the recent trends where deep learning algorithms result in state-of-the-art performance. Specifically, we consider 60 neural network-based VSR techniques in addition to 8 traditional VSR techniques. We extensively cover the deep learning-based techniques including the latest models and introduce a novel taxonomy depending on their architecture. We discuss the pros and cons of each category of techniques. We consider the various components of the problem including the choice of loss functions, evaluation criteria and the benchmark datasets used for evaluation. We present a comparison of the existing techniques using common datasets, providing insights into the relative rankings of these methods. We compare the network architectures based on their computation speed and the network complexity. We also discuss the limitations of existing loss functions and the evaluation criteria that are currently used and propose alternate suggestions. Finally, we identify some of the current challenges and provide future research directions towards video super-resolution, thus providing a comprehensive understanding of the problem.</p></div>","PeriodicalId":49521,"journal":{"name":"Signal Processing-Image Communication","volume":"129 ","pages":"Article 117175"},"PeriodicalIF":3.4,"publicationDate":"2024-07-27","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141852723","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"工程技术","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0