On Human-like Biases in Convolutional Neural Networks for the Perception of Slant from Texture

IF 2.1 · Zone 4 (Computer Science) · Q3 COMPUTER SCIENCE, SOFTWARE ENGINEERING

ACM Transactions on Applied Perception Pub Date : 2023-08-05 DOI:10.1145/3613451

Yuanhao Wang, Qian Zhang, Celine Aubuchon, Jovan T. Kemp, F. Domini, J. Tompkin



Abstract

Depth estimation is fundamental to 3D perception, and humans are known to have biased estimates of depth. This study investigates whether convolutional neural networks (CNNs) can be biased when predicting the sign of curvature and the depth of textured surfaces under different viewing conditions (field of view) and surface parameters (slant and texture irregularity). This hypothesis is drawn from the idea that texture gradients described by local neighborhoods, a cue identified in the human vision literature, are also representable within convolutional neural networks. To this end, we trained both unsupervised and supervised CNN models on renderings of slanted surfaces with random polka-dot patterns and analyzed their internal latent representations. The results show that the unsupervised models have prediction biases similar to those of humans across all experiments, while supervised CNN models do not exhibit similar biases. The latent spaces of the unsupervised models can be linearly separated into axes representing field of view and optical slant. For supervised models, this ability varies substantially with model architecture and the kind of supervision (continuous slant vs. sign of slant). Although this study makes no claim about a shared mechanism, these findings suggest that unsupervised CNN models can make predictions similar to those of the human visual system. Code: github.com/brownvc/Slant-CNN-Biases
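The claim that a latent space can be linearly separated into field-of-view and slant axes can be illustrated with a linear probe. The following is a minimal sketch, not the authors' analysis: the latent vectors are synthetic stand-ins rather than CNN activations, and the dimensions, value ranges, noise level, and function names (`fit_linear_probe`, `r_squared`) are all assumptions made for illustration.

```python
import numpy as np


def _with_bias(latents):
    """Append a constant column so the probe can learn an offset."""
    return np.hstack([latents, np.ones((latents.shape[0], 1))])


def fit_linear_probe(latents, targets):
    """Least-squares linear map (with bias) from latent vectors to targets."""
    weights, *_ = np.linalg.lstsq(_with_bias(latents), targets, rcond=None)
    return weights


def r_squared(latents, targets, weights):
    """Per-target coefficient of determination of the probe's predictions."""
    pred = _with_bias(latents) @ weights
    ss_res = ((targets - pred) ** 2).sum(axis=0)
    ss_tot = ((targets - targets.mean(axis=0)) ** 2).sum(axis=0)
    return 1.0 - ss_res / ss_tot


# Synthetic stand-in for encoder latents (assumption: hypothetical ranges).
rng = np.random.default_rng(0)
n_samples, latent_dim = 500, 16
slant_deg = rng.uniform(-60.0, 60.0, n_samples)  # hypothetical slant range
fov_deg = rng.uniform(10.0, 60.0, n_samples)     # hypothetical field-of-view range
factors = np.column_stack([slant_deg, fov_deg])

# Latents are a noisy linear mixture of the two factors by construction.
mixing = rng.normal(size=(2, latent_dim))
latents = factors @ mixing + 0.1 * rng.normal(size=(n_samples, latent_dim))

# Fit on the first 400 samples, evaluate on the held-out 100.
weights = fit_linear_probe(latents[:400], factors[:400])
scores = r_squared(latents[400:], factors[400:], weights)
print("held-out R^2 (slant, field of view):", scores)
```

A high held-out R² for both targets indicates that each factor is linearly decodable from the latent space. With real model latents, the same probe could be fit to activations extracted from a trained encoder, and a low score for one target would suggest that factor is not linearly represented.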


Source journal

ACM Transactions on Applied Perception (Engineering & Technology - Computer Science: Software Engineering)

CiteScore: 3.70

Self-citation rate: 0.00%

Review time: 12 months

Journal description: ACM Transactions on Applied Perception (TAP) aims to strengthen the synergy between computer science and psychology/perception by publishing top-quality papers that help to unify research in these fields. The journal publishes interdisciplinary research of significant and lasting value in any topic area that spans both Computer Science and Perceptual Psychology. All papers must incorporate both perceptual and computer science components.