Super-Resolution Generator Networks: A comparative study

C. Lungu, R. Potolea
{"title":"Super-Resolution Generator Networks: A comparative study","authors":"C. Lungu, R. Potolea","doi":"10.1109/ICCP.2018.8516603","DOIUrl":null,"url":null,"abstract":"Modern approaches that tackle super-resolution aim to train a generator network that transforms the low resolution image into a higher resolution one. The core learning capacity of these generator networks is given by stacks of well known image processing blocks such as VGG-16 [SZ14], ResNet[HZRS15] or Inception-v3 [SVI $^{+15]}$ blocks. In the light of recent advancements on the CIFAR-10 [KNH] benchmarks where DenseNet [HLW16] and later SparseNet [ZDD $^{+18]}$ proved superior performance over the architectures that used the formerly mentioned blocks, this paper aims to do a comparative study on the performance changes resulting when using DenseNet or SparseNet blocks in generator networks. We first replicate the results of [JAL16]. This work describes a generator network that uses a stack of four ResNet blocks. This stack is incorporated in two architectures for superresolution, one for x4 magnification and another one for x8. We then proceed and substitute them with DenseNet blocks and SparseNet blocks but keep the same overall training procedure. In order to ensure a fair comparison we adapt the number of blocks for each architecture in order to match the same amount of parameters on all architectures. In all cases the same optimization loss function is used, perceptual loss [JAL16], which for a given image yields a value that is a weighted sum of mean-squared-errors between filters of the target input and generated image evaluated on equivalent convolution layers of the last three blocks in the VGG-16 network (pretrained on the ImageNet [DDS $^{+09]}$ dataset). We monitor on all architectures the loss value, the number of epochs needed to reach the lowest loss, the artifacts generated by each network and the overall appearance of the reconstructions.","PeriodicalId":259007,"journal":{"name":"2018 IEEE 14th International Conference on Intelligent Computer Communication and Processing (ICCP)","volume":"14 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2018-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"2018 IEEE 14th International Conference on Intelligent Computer Communication and Processing (ICCP)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/ICCP.2018.8516603","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}

Abstract

Modern approaches to super-resolution train a generator network that transforms a low-resolution image into a higher-resolution one. The core learning capacity of these generator networks comes from stacks of well-known image-processing blocks such as VGG-16 [SZ14], ResNet [HZRS15], or Inception-v3 [SVI+15] blocks. In light of recent advances on the CIFAR-10 [KNH] benchmark, where DenseNet [HLW16] and later SparseNet [ZDD+18] outperformed architectures built from the aforementioned blocks, this paper presents a comparative study of the performance changes that result from using DenseNet or SparseNet blocks in generator networks. We first replicate the results of [JAL16], which describes a generator network built from a stack of four ResNet blocks; this stack is incorporated into two super-resolution architectures, one for x4 magnification and one for x8. We then substitute DenseNet and SparseNet blocks for the ResNet blocks while keeping the same overall training procedure. To ensure a fair comparison, we adapt the number of blocks in each architecture so that all architectures have the same number of parameters. In all cases the same optimization objective is used: perceptual loss [JAL16], which for a given image yields a weighted sum of mean squared errors between feature maps of the target and generated images, evaluated on equivalent convolution layers of the last three blocks of a VGG-16 network pretrained on the ImageNet [DDS+09] dataset. For all architectures we monitor the loss value, the number of epochs needed to reach the lowest loss, the artifacts each network generates, and the overall appearance of the reconstructions.
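To make the perceptual-loss description concrete, below is a minimal PyTorch sketch (the paper does not specify a framework, so PyTorch and torchvision are our assumptions). The feature-tap indices 15, 22, and 29 correspond to the last ReLU of the last three convolution blocks in torchvision's VGG-16, and the uniform weights are illustrative; the exact layers and weights used in [JAL16] may differ.

```python
import torch
import torch.nn as nn
from torchvision import models

class PerceptualLoss(nn.Module):
    """Weighted sum of MSEs between VGG-16 feature maps of the generated
    and target images, in the spirit of [JAL16]. Tap points and weights
    below are assumptions, not the paper's exact configuration."""

    def __init__(self, layer_ids=(15, 22, 29), weights=(1.0, 1.0, 1.0)):
        super().__init__()
        vgg = models.vgg16(weights=models.VGG16_Weights.IMAGENET1K_V1)
        self.vgg = vgg.features.eval()
        for p in self.vgg.parameters():
            p.requires_grad = False          # VGG-16 is a fixed feature extractor
        self.layer_weights = dict(zip(layer_ids, weights))
        self.mse = nn.MSELoss()

    def _features(self, x):
        # Run the image through VGG-16, collecting the tapped activations.
        feats = {}
        for i, layer in enumerate(self.vgg):
            x = layer(x)
            if i in self.layer_weights:
                feats[i] = x
        return feats

    def forward(self, generated, target):
        f_gen, f_tgt = self._features(generated), self._features(target)
        return sum(w * self.mse(f_gen[i], f_tgt[i])
                   for i, w in self.layer_weights.items())
```

Inputs should be normalized with ImageNet statistics before calling the loss, e.g. `loss = PerceptualLoss()(sr_batch, hr_batch)`.

The block substitution at the heart of the comparison changes the feature-combination rule inside each block: ResNet blocks add their input to their output, while DenseNet blocks concatenate it, which grows the channel count layer by layer. A simplified sketch of both follows; layer counts, BatchNorm placement, and the `growth` parameter are illustrative assumptions.

```python
def conv3x3(cin, cout):
    return nn.Sequential(nn.Conv2d(cin, cout, 3, padding=1),
                         nn.BatchNorm2d(cout), nn.ReLU(inplace=True))

class ResidualBlock(nn.Module):
    """ResNet-style block: features combine by addition, so the
    input and output channel counts must match."""
    def __init__(self, channels):
        super().__init__()
        self.body = nn.Sequential(conv3x3(channels, channels),
                                  conv3x3(channels, channels))
    def forward(self, x):
        return x + self.body(x)

class DenseBlock(nn.Module):
    """DenseNet-style block: each layer's output is concatenated to
    its input, so channels grow by `growth` per layer."""
    def __init__(self, channels, growth, n_layers):
        super().__init__()
        self.layers = nn.ModuleList(
            conv3x3(channels + i * growth, growth) for i in range(n_layers))
    def forward(self, x):
        for layer in self.layers:
            x = torch.cat([x, layer(x)], dim=1)
        return x
```

Because concatenation grows the channel count while addition does not, equal block counts do not yield equal parameter counts; this is why the study matches total parameters rather than block counts across architectures.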