Learning GAN-Based Foveated Reconstruction to Recover Perceptually Important Image Features

IF 1.9 · Region 4 (Computer Science) · JCR Q3, COMPUTER SCIENCE, SOFTWARE ENGINEERING
L. Surace, Marek Wernikowski, C. Tursun, K. Myszkowski, R. Mantiuk, P. Didyk
{"title":"Learning GAN-Based Foveated Reconstruction to Recover Perceptually Important Image Features","authors":"L. Surace, Marek Wernikowski, C. Tursun, K. Myszkowski, R. Mantiuk, P. Didyk","doi":"10.1145/3583072","DOIUrl":null,"url":null,"abstract":"A foveated image can be entirely reconstructed from a sparse set of samples distributed according to the retinal sensitivity of the human visual system, which rapidly decreases with increasing eccentricity. The use of generative adversarial networks (GANs) has recently been shown to be a promising solution for such a task, as they can successfully hallucinate missing image information. As in the case of other supervised learning approaches, the definition of the loss function and the training strategy heavily influence the quality of the output. In this work,we consider the problem of efficiently guiding the training of foveated reconstruction techniques such that they are more aware of the capabilities and limitations of the human visual system, and thus can reconstruct visually important image features. Our primary goal is to make the training procedure less sensitive to distortions that humans cannot detect and focus on penalizing perceptually important artifacts. Given the nature of GAN-based solutions, we focus on the sensitivity of human vision to hallucination in case of input samples with different densities. We propose psychophysical experiments, a dataset, and a procedure for training foveated image reconstruction. The proposed strategy renders the generator network flexible by penalizing only perceptually important deviations in the output. As a result, the method emphasized the recovery of perceptually important image features. We evaluated our strategy and compared it with alternative solutions by using a newly trained objective metric, a recent foveated video quality metric, and user experiments. Our evaluations revealed significant improvements in the perceived image reconstruction quality compared with the standard GAN-based training approach.","PeriodicalId":50921,"journal":{"name":"ACM Transactions on Applied Perception","volume":null,"pages":null},"PeriodicalIF":1.9000,"publicationDate":"2021-08-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"ACM Transactions on Applied Perception","FirstCategoryId":"94","ListUrlMain":"https://doi.org/10.1145/3583072","RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q3","JCRName":"COMPUTER SCIENCE, SOFTWARE ENGINEERING","Score":null,"Total":0}
Citations: 0

Abstract

A foveated image can be entirely reconstructed from a sparse set of samples distributed according to the retinal sensitivity of the human visual system, which rapidly decreases with increasing eccentricity. The use of generative adversarial networks (GANs) has recently been shown to be a promising solution for such a task, as they can successfully hallucinate missing image information. As in the case of other supervised learning approaches, the definition of the loss function and the training strategy heavily influence the quality of the output. In this work, we consider the problem of efficiently guiding the training of foveated reconstruction techniques so that they are more aware of the capabilities and limitations of the human visual system and can thus reconstruct visually important image features. Our primary goal is to make the training procedure less sensitive to distortions that humans cannot detect and to focus on penalizing perceptually important artifacts. Given the nature of GAN-based solutions, we concentrate on the sensitivity of human vision to hallucination for input samples of different densities. We propose psychophysical experiments, a dataset, and a procedure for training foveated image reconstruction. The proposed strategy renders the generator network flexible by penalizing only perceptually important deviations in its output. As a result, the method emphasizes the recovery of perceptually important image features. We evaluated our strategy and compared it with alternative solutions using a newly trained objective metric, a recent foveated video quality metric, and user experiments. Our evaluations revealed significant improvements in perceived image reconstruction quality compared with the standard GAN-based training approach.
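The pipeline outlined in the abstract starts from sparse, eccentricity-dependent sampling and then relies on a GAN to hallucinate the missing content. The snippet below is a minimal, hypothetical Python sketch, not the authors' code, of how such an eccentricity-dependent sampling mask could be produced; the function name foveated_sample_mask, the pixels-per-degree value, and the quadratic density falloff are illustrative assumptions standing in for the paper's actual retinal-sensitivity model.

import numpy as np

def foveated_sample_mask(height, width, gaze_xy, ppd=30.0, falloff=0.05, seed=0):
    # Pixel coordinate grids.
    ys, xs = np.mgrid[0:height, 0:width]
    # Eccentricity in visual degrees relative to the gaze point,
    # assuming a fixed pixels-per-degree conversion (ppd).
    ecc = np.hypot(xs - gaze_xy[0], ys - gaze_xy[1]) / ppd
    # Keep-probability is 1 at the fovea and decays with eccentricity;
    # the quadratic falloff is an illustrative stand-in for a
    # retinal-sensitivity model.
    keep_prob = 1.0 / (1.0 + falloff * ecc ** 2)
    rng = np.random.default_rng(seed)
    return rng.random((height, width)) < keep_prob

# Example: retain sparse samples around a gaze point at the image centre.
mask = foveated_sample_mask(720, 1280, gaze_xy=(640, 360))
print(f"retained {mask.mean():.1%} of pixels")

In the setting described in the abstract, a mask of this kind would select the sparse input samples handed to the generator network, which then reconstructs (hallucinates) the discarded pixels.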
Source journal
ACM Transactions on Applied Perception (Engineering and Technology, Computer Science: Software Engineering)
CiteScore: 3.70
Self-citation rate: 0.00%
Articles published: 22
Review time: 12 months
Journal description: ACM Transactions on Applied Perception (TAP) aims to strengthen the synergy between computer science and psychology/perception by publishing top-quality papers that help to unify research in these fields. The journal publishes interdisciplinary research of significant and lasting value in any topic area that spans both Computer Science and Perceptual Psychology. All papers must incorporate both perceptual and computer science components.