Video Super-Resolution Based on Spatial-Temporal Transformer

Minyan Zheng, Jianping Luo, Wenming Cao
Published in: 2021 IEEE 7th International Conference on Cloud Computing and Intelligent Systems (CCIS), 2021-11-07
DOI: https://doi.org/10.1109/CCIS53392.2021.9754604
Citations: 2

Abstract

In this paper, we propose a Spatial-Temporal Transformer (STTF) algorithm for video super-resolution (SR) to address the blur and artifacts that arise when low-resolution (LR) video is super-resolved with traditional super-resolution algorithms. First, the algorithm uses residual blocks to extract initial features from the video sequence. Second, the three-dimensional video features are decomposed into image patches and sent to the Spatial-Temporal Transformer network, where self-attention among patches aligns and fuses them. Finally, a sub-pixel convolution layer and residual layers upsample the features and reconstruct the high-resolution (HR) video sequence. To improve visual quality, the network is trained by minimizing a mean square error (MSE) loss function. Experimental results show that the STTF network achieves a higher peak signal-to-noise ratio (PSNR) and structural similarity index (SSIM) than traditional super-resolution algorithms.
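
The abstract names three standard building blocks without giving details: sub-pixel convolution for upsampling, an MSE training loss, and PSNR as an evaluation metric. The sketch below illustrates only the generic versions of these ideas in plain Python, under stated assumptions: the function names (pixel_shuffle, mse, psnr), the single output channel, the channel ordering (the common "pixel shuffle" convention), and the peak value of 1.0 are all illustrative choices, not taken from the paper. It is not the authors' implementation.

```python
import math


def pixel_shuffle(channels, r):
    """Rearrange r*r feature maps of size HxW into one (H*r)x(W*r) image.

    This is the rearrangement step of a sub-pixel convolution layer.
    Assumed ordering: channel i*r + j fills sub-pixel offset (i, j).
    """
    if len(channels) != r * r:
        raise ValueError("expected r*r input channels")
    h, w = len(channels[0]), len(channels[0][0])
    out = [[0.0] * (w * r) for _ in range(h * r)]
    for i in range(r):
        for j in range(r):
            src = channels[i * r + j]
            for y in range(h):
                for x in range(w):
                    out[y * r + i][x * r + j] = src[y][x]
    return out


def mse(a, b):
    """Mean square error between two equally sized 2-D images."""
    diffs = [(p - q) ** 2 for ra, rb in zip(a, b) for p, q in zip(ra, rb)]
    return sum(diffs) / len(diffs)


def psnr(a, b, peak=1.0):
    """Peak signal-to-noise ratio in dB; `peak` is the assumed maximum pixel value."""
    err = mse(a, b)
    if err == 0:
        return math.inf
    return 10.0 * math.log10(peak ** 2 / err)


# Four 1x1 feature maps become one 2x2 image for an upscaling factor of 2.
sr = pixel_shuffle([[[1.0]], [[2.0]], [[3.0]], [[4.0]]], 2)
# sr == [[1.0, 2.0], [3.0, 4.0]]
```

Because PSNR is a monotonically decreasing function of MSE, minimizing an MSE loss during training directly pushes PSNR up, which is consistent with the metrics the abstract reports.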