Convolutional neural network models applied to neuronal responses in macaque V1 reveal limited nonlinear processing.

Impact factor: 2.0 · Region 4 (Psychology) · JCR Q2 (Ophthalmology)
Hui-Yuan Miao, Frank Tong
Journal of Vision, vol. 24, no. 6, p. 1. Published 2024-06-03.
DOI: 10.1167/jov.24.6.1 (https://doi.org/10.1167/jov.24.6.1)
Full-text PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11156204/pdf/

Abstract

Computational models of the primary visual cortex (V1) have suggested that V1 neurons behave like Gabor filters followed by simple nonlinearities. However, recent work employing convolutional neural network (CNN) models has suggested that V1 relies on far more nonlinear computations than previously thought. Specifically, unit responses in an intermediate layer of VGG-19 were found to best predict macaque V1 responses to thousands of natural and synthetic images. Here, we evaluated the hypothesis that the poor performance of lower layer units in VGG-19 might be attributable to their small receptive field size rather than to their lack of complexity per se. We compared VGG-19 with AlexNet, which has much larger receptive fields in its lower layers. Whereas the best-performing layer of VGG-19 occurred after seven nonlinear steps, the first convolutional layer of AlexNet best predicted V1 responses. Although the predictive accuracy of VGG-19 was somewhat better than that of standard AlexNet, we found that a modified version of AlexNet could match the performance of VGG-19 after only a few nonlinear computations. Control analyses revealed that decreasing the size of the input images caused the best-performing layer of VGG-19 to shift to a lower layer, consistent with the hypothesis that the relationship between image size and receptive field size can strongly affect model performance. We conducted additional analyses using a Gabor pyramid model to test for nonlinear contributions of normalization and contrast saturation. Overall, our findings suggest that the feedforward responses of V1 neurons can be well explained by assuming only a few nonlinear processing stages.
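
The receptive-field argument in the abstract can be illustrated with a short calculation. The sketch below is not the authors' code. It applies the standard recurrence for receptive field size in a feedforward stack (each layer adds (kernel - 1) times the accumulated stride) to the published kernel and stride values of the lower layers of VGG-19 and AlexNet. Treating the "seven nonlinear steps" layer as conv3_1 (five ReLUs plus two max-pooling stages) is an assumption made here for illustration; the abstract does not name the layer. The function and variable names are likewise hypothetical.

```python
from typing import List, Tuple

# (layer name, kernel size, stride); padding does not change receptive field size
Layer = Tuple[str, int, int]


def receptive_fields(layers: List[Layer]) -> List[Tuple[str, int]]:
    """Return the receptive field size (in input pixels) after each layer."""
    rf = 1    # receptive field of a single input pixel
    jump = 1  # distance in input pixels between adjacent units of the current layer
    sizes = []
    for name, kernel, stride in layers:
        rf += (kernel - 1) * jump
        jump *= stride
        sizes.append((name, rf))
    return sizes


# Lower layers of VGG-19: 3x3 convolutions (stride 1) and 2x2 max pooling (stride 2).
VGG19_LOWER: List[Layer] = [
    ("conv1_1", 3, 1), ("conv1_2", 3, 1), ("pool1", 2, 2),
    ("conv2_1", 3, 1), ("conv2_2", 3, 1), ("pool2", 2, 2),
    ("conv3_1", 3, 1),  # assumed here to be the layer reached after seven nonlinear steps
]

# Lower layers of AlexNet: an 11x11 stride-4 convolution, 3x3 stride-2 pooling, 5x5 convolution.
ALEXNET_LOWER: List[Layer] = [
    ("conv1", 11, 4), ("pool1", 3, 2), ("conv2", 5, 1),
]

if __name__ == "__main__":
    for label, stack in (("VGG-19", VGG19_LOWER), ("AlexNet", ALEXNET_LOWER)):
        for name, size in receptive_fields(stack):
            print(f"{label:8s} {name:8s} receptive field = {size} x {size} px")
```

Under these assumptions, the first convolutional layer of VGG-19 covers only 3 x 3 pixels, whereas the first convolutional layer of AlexNet already covers 11 x 11 pixels; VGG-19 units do not exceed that extent until the second convolutional block. This is consistent with the abstract's suggestion that receptive field size, rather than depth of nonlinearity, may explain why a deeper VGG-19 layer fits V1 responses best.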

Source journal
Journal of Vision (Medicine: Ophthalmology)
CiteScore: 2.90
Self-citation rate: 5.60%
Articles published: 218
Review time: 3-6 weeks
Journal description: Exploring all aspects of biological visual function, including spatial vision, perception, low vision, color vision and more, spanning the fields of neuroscience, psychology and psychophysics.