Convolutional neural network models applied to neuronal responses in macaque V1 reveal limited nonlinear processing.

IF 2.0 | Zone 4 (Psychology) | JCR Q2 (Ophthalmology)

Journal of Vision | Pub Date: 2024-06-03 | DOI: 10.1167/jov.24.6.1

Hui-Yuan Miao, Frank Tong

Journal of Vision, volume 24(6), page 1. Publication type: Journal Article. Full text (PDF): https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11156204/pdf/

Citations: 0

Abstract

Convolutional neural network models applied to neuronal responses in macaque V1 reveal limited nonlinear processing.

Computational models of the primary visual cortex (V1) have suggested that V1 neurons behave like Gabor filters followed by simple nonlinearities. However, recent work employing convolutional neural network (CNN) models has suggested that V1 relies on far more nonlinear computations than previously thought. Specifically, unit responses in an intermediate layer of VGG-19 were found to best predict macaque V1 responses to thousands of natural and synthetic images. Here, we evaluated the hypothesis that the poor performance of lower layer units in VGG-19 might be attributable to their small receptive field size rather than to their lack of complexity per se. We compared VGG-19 with AlexNet, which has much larger receptive fields in its lower layers. Whereas the best-performing layer of VGG-19 occurred after seven nonlinear steps, the first convolutional layer of AlexNet best predicted V1 responses. Although the predictive accuracy of VGG-19 was somewhat better than that of standard AlexNet, we found that a modified version of AlexNet could match the performance of VGG-19 after only a few nonlinear computations. Control analyses revealed that decreasing the size of the input images caused the best-performing layer of VGG-19 to shift to a lower layer, consistent with the hypothesis that the relationship between image size and receptive field size can strongly affect model performance. We conducted additional analyses using a Gabor pyramid model to test for nonlinear contributions of normalization and contrast saturation. Overall, our findings suggest that the feedforward responses of V1 neurons can be well explained by assuming only a few nonlinear processing stages.
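The receptive-field argument above can be made concrete with a small calculation. The sketch below is not the authors' analysis code; it only computes theoretical receptive field sizes (in input pixels) for the early layers of each network. The layer specifications (kernel size, stride) are assumptions taken from the standard published VGG-19 and AlexNet architectures, not from this abstract, and the function and variable names are hypothetical. How these pixel sizes relate to V1 receptive fields depends on how the stimulus images were scaled, which the abstract does not specify.

```python
def receptive_fields(layers):
    """Return {layer_name: theoretical receptive field size in input pixels}.

    `layers` is a list of (name, kernel_size, stride) tuples in feedforward
    order. Padding shifts receptive fields but does not change their size,
    so it is ignored here.
    """
    rf = 1    # receptive field of a single input pixel
    jump = 1  # distance in input pixels between adjacent units of the current layer
    sizes = {}
    for name, kernel, stride in layers:
        rf += (kernel - 1) * jump  # each extra kernel tap widens the field by one jump
        jump *= stride             # striding spreads units further apart
        sizes[name] = rf
    return sizes


# Assumed standard architectures (early layers only).
VGG19_EARLY = [
    ("conv1_1", 3, 1), ("conv1_2", 3, 1), ("pool1", 2, 2),
    ("conv2_1", 3, 1), ("conv2_2", 3, 1), ("pool2", 2, 2),
    ("conv3_1", 3, 1),
]
ALEXNET_EARLY = [
    ("conv1", 11, 4), ("pool1", 3, 2), ("conv2", 5, 1),
]

for label, spec in (("VGG-19", VGG19_EARLY), ("AlexNet", ALEXNET_EARLY)):
    print(label)
    for name, size in receptive_fields(spec).items():
        print(f"  {name}: {size} x {size} px")
```

Under these assumptions, the first convolutional layer of AlexNet already spans 11 x 11 pixels, whereas VGG-19 needs several stacked 3 x 3 convolutions and two pooling stages before its units cover a comparable region. This is consistent with the hypothesis that receptive field size, rather than depth of nonlinear processing alone, influences which layer best predicts V1 responses.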

Source journal

Journal of Vision (Medicine: Ophthalmology)

CiteScore: 2.90
Self-citation rate: 5.60%
Articles published: 218
Review time: 3-6 weeks

About the journal: Exploring all aspects of biological visual function, including spatial vision, perception, low vision, color vision and more, spanning the fields of neuroscience, psychology and psychophysics.