A Practical Gated Recurrent Transformer Network Incorporating Multiple Fusions for Video Denoising

Kai Guo, Seungwon Choi, Jongseong Choi, Lae-Hoon Kim
arXiv - EE - Image and Video Processing, published 2024-09-10. Identifier: arxiv-2409.06603 (https://doi.org/arxiv-2409.06603)

Abstract

State-of-the-art (SOTA) video denoising methods employ multi-frame simultaneous denoising mechanisms, resulting in significant delays (e.g., 16 frames), making them impractical for real-time cameras. To overcome this limitation, we propose a multi-fusion gated recurrent Transformer network (GRTN) that achieves SOTA denoising performance with only a single-frame delay. Specifically, the spatial denoising module extracts features from the current frame, while the reset gate selects relevant information from the previous frame and fuses it with current frame features via the temporal denoising module. The update gate then further blends this result with the previous frame features, and the reconstruction module integrates it with the current frame. To robustly compute attention for noisy features, we propose a residual simplified Swin Transformer with Euclidean distance (RSSTE) in the spatial and temporal denoising modules. Comparative objective and subjective results show that our GRTN achieves denoising performance comparable to SOTA multi-frame delay networks, with only a single-frame delay.
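
The data flow described in the abstract can be illustrated with a small, parameter-free sketch. This is not the authors' implementation: the abstract does not give the gate equations or the RSSTE definition, so everything below is an assumption made for illustration. In particular, the attention score is assumed to be the negative squared Euclidean distance between query and key, the gates are fixed sigmoid functions standing in for learned layers, and the function names `euclidean_attention` and `gated_recurrent_step` are hypothetical. Features are plain lists of token vectors.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def softmax(scores):
    # Subtract the maximum for numerical stability.
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def euclidean_attention(queries, keys, values, scale=1.0):
    """Attention scored by negative squared Euclidean distance (assumed form)."""
    dim = len(values[0])
    out = []
    for q in queries:
        scores = [-scale * sum((qi - ki) ** 2 for qi, ki in zip(q, k)) for k in keys]
        weights = softmax(scores)
        out.append([sum(w * v[d] for w, v in zip(weights, values)) for d in range(dim)])
    return out


def gated_recurrent_step(cur, prev):
    """One recurrent step: reset gate, temporal fusion, update gate, reconstruction.

    cur  : features of the current frame (stand-in for the spatial module output)
    prev : recurrent features carried over from the previous frame
    The gates are fixed functions here; a real network would learn them.
    """
    # Reset gate selects which previous-frame information to keep.
    reset = [[sigmoid(c + p) for c, p in zip(ct, pt)] for ct, pt in zip(cur, prev)]
    selected = [[r * p for r, p in zip(rt, pt)] for rt, pt in zip(reset, prev)]
    # Temporal fusion: current tokens attend to the selected previous tokens,
    # with a residual connection back to the current features.
    attended = euclidean_attention(cur, selected, selected)
    candidate = [[c + a for c, a in zip(ct, at)] for ct, at in zip(cur, attended)]
    # Update gate blends the fused result with the previous-frame features.
    update = [[sigmoid(c - p) for c, p in zip(ct, pt)] for ct, pt in zip(candidate, prev)]
    hidden = [
        [z * c + (1.0 - z) * p for z, c, p in zip(zt, ct, pt)]
        for zt, ct, pt in zip(update, candidate, prev)
    ]
    # Reconstruction: integrate the blended features with the current frame.
    output = [[h + c for h, c in zip(ht, ct)] for ht, ct in zip(hidden, cur)]
    return output, hidden


if __name__ == "__main__":
    frames = [[[1.0, 2.0], [3.0, 4.0]], [[1.1, 1.9], [2.9, 4.2]]]
    state = [[0.0, 0.0], [0.0, 0.0]]  # assumed zero initial state
    for frame in frames:
        denoised, state = gated_recurrent_step(frame, state)
        print(denoised)
```

Because each step needs only the current frame and the state carried over from the previous one, a loop of this shape never has to wait for future frames, which is the property the abstract contrasts with multi-frame methods.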