Virginia E Strehle, Natalie K Bendiksen, Alice J O'Toole
{"title":"Deep convolutional neural networks are sensitive to face configuration.","authors":"Virginia E Strehle, Natalie K Bendiksen, Alice J O'Toole","doi":"10.1167/jov.24.12.6","DOIUrl":null,"url":null,"abstract":"<p><p>Deep convolutional neural networks (DCNNs) are remarkably accurate models of human face recognition. However, less is known about whether these models generate face representations similar to those used by humans. Sensitivity to facial configuration has long been considered a marker of human perceptual expertise for faces. We tested whether DCNNs trained for face identification \"perceive\" alterations to facial features and their configuration. We also compared the extent to which representations changed as a function of the alteration type. Facial configuration was altered by changing the distance between the eyes or the distance between the nose and mouth. Facial features were altered by replacing the eyes or mouth with those of another face. Altered faces were processed by DCNNs (Ranjan et al., 2018; Szegedy et al., 2017) and the similarity of the generated representations was compared. Both DCNNs were sensitive to configural and feature changes-with changes to configuration altering the DCNN representations more than changes to face features. To determine whether the DCNNs' greater sensitivity to configuration was due to a priori differences in the images or characteristics of the DCNN processing, we compared the representation of features and configuration between the low-level, pixel-based representations and the DCNN-generated representations. Sensitivity to face configuration increased from the pixel-level image to the DCNN encoding, whereas the sensitivity to features did not change. The enhancement of configural information may be due to the utility of configuration for discriminating among similar faces combined with the within-category nature of face identification training.</p>","PeriodicalId":49955,"journal":{"name":"Journal of Vision","volume":"24 12","pages":"6"},"PeriodicalIF":2.0000,"publicationDate":"2024-11-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11542502/pdf/","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Journal of Vision","FirstCategoryId":"3","ListUrlMain":"https://doi.org/10.1167/jov.24.12.6","RegionNum":4,"RegionCategory":"心理学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q2","JCRName":"OPHTHALMOLOGY","Score":null,"Total":0}
引用次数: 0
Abstract
Deep convolutional neural networks (DCNNs) are remarkably accurate models of human face recognition. However, less is known about whether these models generate face representations similar to those used by humans. Sensitivity to facial configuration has long been considered a marker of human perceptual expertise for faces. We tested whether DCNNs trained for face identification "perceive" alterations to facial features and their configuration. We also compared the extent to which representations changed as a function of the alteration type. Facial configuration was altered by changing the distance between the eyes or the distance between the nose and mouth. Facial features were altered by replacing the eyes or mouth with those of another face. Altered faces were processed by DCNNs (Ranjan et al., 2018; Szegedy et al., 2017) and the similarity of the generated representations was compared. Both DCNNs were sensitive to configural and feature changes-with changes to configuration altering the DCNN representations more than changes to face features. To determine whether the DCNNs' greater sensitivity to configuration was due to a priori differences in the images or characteristics of the DCNN processing, we compared the representation of features and configuration between the low-level, pixel-based representations and the DCNN-generated representations. Sensitivity to face configuration increased from the pixel-level image to the DCNN encoding, whereas the sensitivity to features did not change. The enhancement of configural information may be due to the utility of configuration for discriminating among similar faces combined with the within-category nature of face identification training.
期刊介绍:
Exploring all aspects of biological visual function, including spatial vision, perception,
low vision, color vision and more, spanning the fields of neuroscience, psychology and psychophysics.