Training and Predicting Visual Error for Real-Time Applications

Impact Factor: 2.3 · JCR: Q3 · Category: Computer Science, Software Engineering

Proceedings of the ACM on computer graphics and interactive techniques · Pub Date: 2022-05-04 · DOI: 10.1145/3522625

Joao Liborio Cardoso, B. Kerbl, Lei Yang, Yury Uralsky, M. Wimmer

Citations: 1

Abstract

Visual error metrics play a fundamental role in the quantification of perceived image similarity. Most recently, use cases for them in real-time applications have emerged, such as content-adaptive shading and shading reuse to increase performance and improve efficiency. A wide range of different metrics has been established, with the most sophisticated being capable of capturing the perceptual characteristics of the human visual system. However, their complexity, computational expense, and reliance on reference images to compare against prevent their generalized use in real-time, restricting such applications to using only the simplest available metrics. In this work, we explore the abilities of convolutional neural networks to predict a variety of visual metrics without requiring either reference or rendered images. Specifically, we train and deploy a neural network to estimate the visual error resulting from reusing shading or using reduced shading rates. The resulting models account for 70%-90% of the variance while achieving up to an order of magnitude faster computation times. Our solution combines image-space information that is readily available in most state-of-the-art deferred shading pipelines with reprojection from previous frames to enable an adequate estimate of visual errors, even in previously unseen regions. We describe a suitable convolutional network architecture and considerations for data preparation for training. We demonstrate the capability of our network to predict complex error metrics at interactive rates in a real-time application that implements content-adaptive shading in a deferred pipeline. Depending on the portion of unseen image regions, our approach can achieve up to 2x performance compared to state-of-the-art methods.
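The abstract's two quantitative ideas, per-region shading-rate selection driven by a predicted error and "variance accounted for", can be illustrated with a minimal sketch. Everything here is an assumption for illustration rather than the authors' implementation: the rate labels, the `select_shading_rates` and `r_squared` helpers, the per-tile layout, and the threshold are hypothetical, and the predicted error maps stand in for the output of a trained network.

```python
from typing import Dict, List, Sequence

# Hypothetical reduced shading rates, ordered from coarsest to finest.
REDUCED_RATES = ("4x4", "2x2")


def select_shading_rates(
    predicted_error: Dict[str, List[List[float]]],
    threshold: float,
    fallback: str = "1x1",
) -> List[List[str]]:
    """Per tile, pick the coarsest rate whose predicted error is within threshold.

    predicted_error maps a rate label to a 2D grid of per-tile error
    estimates (assumed to come from a network; here they are plain numbers).
    Tiles where no reduced rate is acceptable fall back to full-rate shading.
    """
    rates = [r for r in REDUCED_RATES if r in predicted_error]
    grid = predicted_error[rates[0]]
    rows, cols = len(grid), len(grid[0])
    result = []
    for y in range(rows):
        row = []
        for x in range(cols):
            chosen = fallback
            for rate in rates:  # coarsest first
                if predicted_error[rate][y][x] <= threshold:
                    chosen = rate
                    break
            row.append(chosen)
        result.append(row)
    return result


def r_squared(reference: Sequence[float], predicted: Sequence[float]) -> float:
    """Coefficient of determination: share of reference variance explained."""
    mean_ref = sum(reference) / len(reference)
    ss_res = sum((r - p) ** 2 for r, p in zip(reference, predicted))
    ss_tot = sum((r - mean_ref) ** 2 for r in reference)
    return 1.0 - ss_res / ss_tot


if __name__ == "__main__":
    errors = {
        "4x4": [[0.04, 0.50, 0.30]],
        "2x2": [[0.01, 0.20, 0.03]],
    }
    print(select_shading_rates(errors, threshold=0.05))
    print(r_squared([1.0, 2.0, 3.0], [1.0, 2.0, 3.0]))
```

Under this reading, a model that "accounts for 70%-90% of the variance" would score between 0.7 and 0.9 on `r_squared` when its predictions are compared against the reference metric values.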

Source journal

Proceedings of the ACM on computer graphics and interactive techniques

CiteScore

2.90

Self-citation rate

0.00%

Articles published