Multi-Stage Raw Video Denoising with Adversarial Loss and Gradient Mask

2021 IEEE International Conference on Computational Photography (ICCP) | Pub Date: 2021-03-04 | DOI: 10.1109/ICCP51581.2021.9466268

Avinash Paliwal, Libing Zeng, N. Kalantari

Citations: 4

Abstract

In this paper, we propose a learning-based approach for denoising raw videos captured under low lighting conditions. We propose to do this by first explicitly aligning the neighboring frames to the current frame using a convolutional neural network (CNN). We then fuse the registered frames using another CNN to obtain the final denoised frame. To avoid directly aligning the temporally distant frames, we perform the two processes of alignment and fusion in multiple stages. Specifically, at each stage, we perform the denoising process on three consecutive input frames to generate the intermediate denoised frames which are then passed as the input to the next stage. By performing the process in multiple stages, we can effectively utilize the information of neighboring frames without directly aligning the temporally distant frames. We train our multi-stage system using an adversarial loss with a conditional discriminator. Specifically, we condition the discriminator on a soft gradient mask to prevent introducing high-frequency artifacts in smooth regions. We show that our system is able to produce temporally coherent videos with realistic details. Furthermore, we demonstrate through extensive experiments that our approach outperforms state-of-the-art image and video denoising methods both numerically and visually.
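
The multi-stage structure described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: `align` and `fuse` are hypothetical stand-ins for the two CNNs (an identity warp and a plain average), the default of two stages is an assumption, and `soft_gradient_mask` with its `scale` parameter is only one plausible way to build a soft mask from image gradients, since the abstract does not define the mask.

```python
import numpy as np

def align(neighbor, reference):
    # Hypothetical stand-in for the alignment CNN: returns the neighbor unchanged.
    return neighbor

def fuse(prev_aligned, current, next_aligned):
    # Hypothetical stand-in for the fusion CNN: a plain average of three frames.
    return (prev_aligned + current + next_aligned) / 3.0

def denoise_stage(frames):
    # One stage: every output frame is built from three consecutive input frames.
    out = []
    for t in range(1, len(frames) - 1):
        prev_a = align(frames[t - 1], frames[t])
        next_a = align(frames[t + 1], frames[t])
        out.append(fuse(prev_a, frames[t], next_a))
    return out

def multi_stage_denoise(frames, num_stages=2):
    # Each stage consumes the intermediate frames of the previous stage, so
    # only adjacent frames are ever aligned directly. The stage count is an
    # assumption made for this sketch.
    for _ in range(num_stages):
        frames = denoise_stage(frames)
    return frames

def soft_gradient_mask(image, scale=10.0):
    # Assumed mask form: close to 0 in smooth regions, approaching 1 at
    # strong edges. `scale` is an illustrative parameter.
    gy, gx = np.gradient(image)
    magnitude = np.sqrt(gx ** 2 + gy ** 2)
    return 1.0 - np.exp(-scale * magnitude)

frames = [np.full((4, 4), float(i)) for i in range(5)]
result = multi_stage_denoise(frames, num_stages=2)
mask = soft_gradient_mask(np.zeros((4, 4)))
```

In this sketch, five input frames and two stages yield a single output frame that draws on all five inputs, even though each alignment call only ever sees frames one step apart.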
