Learning GAN-Based Foveated Reconstruction to Recover Perceptually Important Image Features

IF 1.9, Zone 4 (Computer Science), JCR Q3 (COMPUTER SCIENCE, SOFTWARE ENGINEERING)

ACM Transactions on Applied Perception, Pub Date: 2023-04-21, DOI: https://dl.acm.org/doi/10.1145/3583072

Luca Surace, Marek Wernikowski, Cara Tursun, Karol Myszkowski, Radosław Mantiuk, Piotr Didyk


Citations: 0


Learning GAN-Based Foveated Reconstruction to Recover Perceptually Important Image Features

A foveated image can be entirely reconstructed from a sparse set of samples distributed according to the retinal sensitivity of the human visual system, which rapidly decreases with increasing eccentricity. The use of generative adversarial networks (GANs) has recently been shown to be a promising solution for such a task, as they can successfully hallucinate missing image information. As in the case of other supervised learning approaches, the definition of the loss function and the training strategy heavily influence the quality of the output. In this work, we consider the problem of efficiently guiding the training of foveated reconstruction techniques such that they are more aware of the capabilities and limitations of the human visual system, and thus can reconstruct visually important image features. Our primary goal is to make the training procedure less sensitive to distortions that humans cannot detect and to focus on penalizing perceptually important artifacts. Given the nature of GAN-based solutions, we focus on the sensitivity of human vision to hallucination in the case of input samples with different densities. We propose psychophysical experiments, a dataset, and a procedure for training foveated image reconstruction. The proposed strategy renders the generator network flexible by penalizing only perceptually important deviations in the output. As a result, the method emphasizes the recovery of perceptually important image features. We evaluated our strategy and compared it with alternative solutions by using a newly trained objective metric, a recent foveated video quality metric, and user experiments. Our evaluations revealed significant improvements in the perceived image reconstruction quality compared with the standard GAN-based training approach.
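To make the idea of "a sparse set of samples distributed according to retinal sensitivity" concrete, the following is a minimal sketch of an eccentricity-dependent sampling mask. It is not the authors' sampling scheme: the function name `foveated_sample_mask`, the exponential falloff, and every parameter value (`fovea_radius`, `falloff`, `min_density`) are assumptions chosen only for illustration, and eccentricity is approximated by normalized pixel distance from the gaze point rather than by visual angle.

```python
import numpy as np


def foveated_sample_mask(height, width, gaze_xy, fovea_radius=0.1,
                         falloff=12.0, min_density=0.02, seed=0):
    """Return a boolean mask whose sampling density decays with eccentricity.

    Illustrative only: the exponential falloff and all default values are
    assumptions, not taken from the paper.
    """
    rng = np.random.default_rng(seed)
    ys, xs = np.mgrid[0:height, 0:width]
    gx, gy = gaze_xy
    # Eccentricity proxy: distance to the gaze point, normalized by the diagonal.
    ecc = np.hypot(xs - gx, ys - gy) / np.hypot(width, height)
    # Density is 1 inside the foveal radius and decays exponentially outside it,
    # never dropping below min_density.
    density = np.exp(-falloff * np.maximum(ecc - fovea_radius, 0.0))
    density = np.clip(density, min_density, 1.0)
    # Keep each pixel independently with probability equal to its density.
    return rng.random((height, width)) < density


# Example: mask a random 128x128 RGB image with the gaze at the image center.
image = np.random.default_rng(1).random((128, 128, 3))
mask = foveated_sample_mask(128, 128, gaze_xy=(64, 64))
sparse_input = image * mask[..., None]  # dropped pixels become zero
```

A reconstruction network would take something like `sparse_input` (and possibly `mask`) as input and be trained to fill in the dropped pixels; the sampling density in the periphery determines how much content the generator has to hallucinate.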


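Source journal

ACM Transactions on Applied Perception (Engineering & Technology, Computer Science: Software Engineering)

CiteScore

3.70

Self-citation rate

0.00%

Publication volume

Review time

12 months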

About the journal: ACM Transactions on Applied Perception (TAP) aims to strengthen the synergy between computer science and psychology/perception by publishing top-quality papers that help to unify research in these fields. The journal publishes interdisciplinary research of significant and lasting value in any topic area that spans both Computer Science and Perceptual Psychology. All papers must incorporate both perceptual and computer science components.