What Do Visual Neural Networks Learn?

IF 5.5 | Zone 2 (Medicine) | JCR Q1, Neurosciences

Annual Review of Vision Science | Pub Date: 2025-09-01 | Epub Date: 2025-07-29 | DOI: 10.1146/annurev-vision-110323-112903

Daniella Har-Shalom, Yair Weiss

Pages: 591-610 | Publication type: Journal Article

Citations: 0

Abstract


What Do Visual Neural Networks Learn?

Over the past decade, artificial neural networks trained to classify images downloaded from the internet have achieved astounding, almost superhuman performance and have been suggested as possible models for human vision. In this article, we review experimental evidence from multiple studies elucidating the classification strategy learned by successful visual neural networks (VNNs) and how this strategy may be related to human vision as well as previous approaches to computer vision. The studies we review evaluate the performance of VNNs on carefully designed tasks that are meant to tease out the cues they use. The use of this method shows that VNNs are often fooled by image changes to which human object recognition is largely invariant (e.g., the change of a few pixels in the image or a change of the background or illumination), and, conversely, that the networks can be invariant to very large image manipulations that disrupt human performance (e.g., randomly permuting the patches of an image). Taken together, the evidence suggests that these networks have learned relatively low-level cues that are extremely effective at classifying internet images but are ineffective at classifying many other images that humans can classify effortlessly.
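One of the manipulations mentioned above, randomly permuting the patches of an image, can be sketched in a few lines. This is a minimal illustration and not the procedure used in any of the reviewed studies: the function name `permute_patches`, the square `grid` layout, and the NumPy-based implementation are all assumptions made for the example.

```python
import numpy as np


def permute_patches(image, grid, rng):
    """Shuffle the patches of an H x W x C image laid out on a grid x grid lattice.

    Hypothetical helper for illustration; the grid size is an assumption,
    not a value taken from the review.
    """
    h, w, c = image.shape
    if h % grid or w % grid:
        raise ValueError("image height and width must be divisible by grid")
    ph, pw = h // grid, w // grid

    # Split into patches: (grid, ph, grid, pw, c) -> (grid * grid, ph, pw, c)
    patches = image.reshape(grid, ph, grid, pw, c).transpose(0, 2, 1, 3, 4)
    patches = patches.reshape(grid * grid, ph, pw, c)

    # Reorder the patches with a random permutation
    shuffled = patches[rng.permutation(grid * grid)]

    # Reassemble the shuffled patches into a full image
    out = shuffled.reshape(grid, grid, ph, pw, c).transpose(0, 2, 1, 3, 4)
    return out.reshape(h, w, c)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    img = rng.integers(0, 256, size=(224, 224, 3), dtype=np.uint8)
    scrambled = permute_patches(img, grid=4, rng=rng)
    print(scrambled.shape)
```

The scrambled image keeps every local patch intact but destroys the global arrangement, so comparing a classifier's accuracy on the original and scrambled versions gives a rough indication of how much it relies on local cues versus overall shape.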


Source journal

Annual Review of Vision Science Medicine-Ophthalmology

CiteScore

11.10

Self-citation rate

1.70%

Articles published

About the journal: The Annual Review of Vision Science reviews progress in the visual sciences, a cross-cutting set of disciplines which intersect psychology, neuroscience, computer science, cell biology and genetics, and clinical medicine. The journal covers a broad range of topics and techniques, including optics, retina, central visual processing, visual perception, eye movements, visual development, vision models, computer vision, and the mechanisms of visual disease, dysfunction, and sight restoration. The study of vision is central to progress in many areas of science, and this new journal will explore and expose the connections that link it to biology, behavior, computation, engineering, and medicine.