Multi-Scale Video Inverse Tone Mapping with Deformable Alignment

2020 IEEE International Conference on Visual Communications and Image Processing (VCIP). Pub Date: 2020-12-01. DOI: 10.1109/VCIP49819.2020.9301780

Jiaqi Zou, Ke Mei, Songlin Sun

Citations: 3

Abstract

Inverse tone mapping (iTM) is an operation that transforms low-dynamic-range (LDR) content into high-dynamic-range (HDR) content, and it is an effective technique for improving the visual experience. iTM has developed rapidly with deep learning algorithms in recent years. However, the great majority of deep-learning-based iTM methods are aimed at images and ignore the temporal correlations of consecutive frames in videos. In this paper, we propose a multi-scale video iTM network with deformable alignment, which increases temporal consistency in videos. We first align the input consecutive LDR frames at the feature level by deformable convolutions and then simultaneously use multi-frame information to generate the HDR frame. Additionally, we adopt a multi-scale iTM architecture with a pyramid pooling module, which enables our network to reconstruct details as well as global features. The proposed network achieves better performance than other iTM methods on quantitative metrics and gains a significant visual improvement.
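To make the "pyramid pooling module" idea concrete, the following is a minimal sketch of the general technique in plain Python, not the authors' implementation. The abstract does not give the module's configuration, so the bin sizes, the single-channel feature map, the nearest-neighbour upsampling, and all function names (`adaptive_avg_pool`, `upsample_nearest`, `pyramid_pool`) are assumptions made for illustration. A learned module would typically also apply convolutions to each branch, which are omitted here.

```python
from typing import List, Sequence

Grid = List[List[float]]  # one feature channel, indexed [row][col]


def adaptive_avg_pool(feat: Grid, bins: int) -> Grid:
    """Average-pool a single-channel map down to a bins x bins grid."""
    h, w = len(feat), len(feat[0])
    pooled = []
    for i in range(bins):
        # Rows covered by bin i: floor(i*h/bins) up to ceil((i+1)*h/bins).
        r0, r1 = (i * h) // bins, -((-(i + 1) * h) // bins)
        row = []
        for j in range(bins):
            c0, c1 = (j * w) // bins, -((-(j + 1) * w) // bins)
            cells = [feat[r][c] for r in range(r0, r1) for c in range(c0, c1)]
            row.append(sum(cells) / len(cells))
        pooled.append(row)
    return pooled


def upsample_nearest(grid: Grid, h: int, w: int) -> Grid:
    """Resize a coarse grid back to h x w by nearest-neighbour lookup."""
    gh, gw = len(grid), len(grid[0])
    return [[grid[(r * gh) // h][(c * gw) // w] for c in range(w)]
            for r in range(h)]


def pyramid_pool(feat: Grid, bin_sizes: Sequence[int] = (1, 2)) -> List[Grid]:
    """Return the input map plus one upsampled context map per bin size.

    The returned list plays the role of a channel-wise concatenation:
    fine detail (the input) alongside progressively more global context.
    The bin sizes are an assumed, illustrative choice.
    """
    h, w = len(feat), len(feat[0])
    branches = [feat]
    for bins in bin_sizes:
        coarse = adaptive_avg_pool(feat, bins)
        branches.append(upsample_nearest(coarse, h, w))
    return branches
```

For a 4x4 map holding the values 0 to 15 in row-major order, `pyramid_pool(feat, (1, 2))` returns three maps: the input itself, a map filled with the global mean 7.5, and a map whose four quadrants hold their local means. Combining such maps is how a pyramid pooling branch gives each position access to both local detail and global statistics.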
