Multi-scale feature fusion network with spatial-temporal alignment for video denoising
Yushan Lv, Di Wu, Yuhang Li, Youdong Ding
Third International Seminar on Artificial Intelligence, Networking, and Information Technology, published 2023-02-22. DOI: 10.1117/12.2667325
Citations: 0
Abstract
Most existing video denoising methods based on the PatchMatch algorithm and optical flow estimation tend to produce blurring artifacts and denoise poorly on scale-varying data. To tackle these issues, we propose a multi-scale feature fusion network built on different pyramid blocks and adaptive spatial-channel attention, which effectively extracts multi-scale feature information from noisy video. Furthermore, we develop a spatial-temporal alignment module with deformable convolution that aligns implicit features and reduces blurring artifacts. The results show that the proposed method outperforms state-of-the-art algorithms in both visual and objective quality metrics on the public DAVIS and Set8 datasets.
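The core idea behind deformable-convolution alignment is to warp a neighboring frame's feature map toward the reference frame using learned per-pixel offsets, sampling at fractional positions via bilinear interpolation. The abstract does not give the module's architecture, so the following is only an illustrative NumPy sketch of that sampling step (the function names `bilinear_sample` and `align_features` and the single-channel setup are assumptions for clarity, not the authors' implementation, which would use a full deformable convolution layer):

```python
import numpy as np

def bilinear_sample(feat, ys, xs):
    """Bilinearly sample a 2-D feature map at fractional coords (ys, xs)."""
    H, W = feat.shape
    ys = np.clip(ys, 0, H - 1)
    xs = np.clip(xs, 0, W - 1)
    y0 = np.floor(ys).astype(int)
    x0 = np.floor(xs).astype(int)
    y1 = np.minimum(y0 + 1, H - 1)
    x1 = np.minimum(x0 + 1, W - 1)
    wy = ys - y0
    wx = xs - x0
    # Interpolate along x on the top and bottom rows, then along y.
    top = feat[y0, x0] * (1 - wx) + feat[y0, x1] * wx
    bot = feat[y1, x0] * (1 - wx) + feat[y1, x1] * wx
    return top * (1 - wy) + bot * wy

def align_features(neighbor_feat, offsets):
    """Warp a neighbor frame's features by per-pixel (dy, dx) offsets.

    offsets has shape (2, H, W); in a real network these offsets are
    predicted by a small convolutional branch rather than given.
    """
    H, W = neighbor_feat.shape
    gy, gx = np.meshgrid(np.arange(H, dtype=float),
                         np.arange(W, dtype=float), indexing="ij")
    return bilinear_sample(neighbor_feat, gy + offsets[0], gx + offsets[1])
```

With zero offsets the features pass through unchanged; a constant offset of +1 in x shifts every sample one column to the right, which is the mechanism the alignment module exploits (with spatially varying, learned offsets) to compensate for motion between frames.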