Evaluating Attention in Convolutional Neural Networks for Blended Images

2022 IEEE 5th International Conference on Image Processing Applications and Systems (IPAS) Pub Date : 2022-12-05 DOI:10.1109/IPAS55744.2022.10052853

Andrea Portscher, Sebastian Stabinger, A. Rodríguez-Sánchez


Abstract

In neuroscientific experiments, blended images are used to examine how attention mechanisms in the human brain work. They are particularly well suited to this research area, because a subject must focus on particular features in an image to classify superimposed objects. Because Convolutional Neural Networks (CNNs) take some inspiration from the mammalian visual system (for example, the hierarchical structure in which different levels of abstraction are processed on different network layers), we examine how CNNs perform on this task. More specifically, we evaluate the performance of four popular CNN architectures (ResNet18, ResNet50, CORnet-Z, and Inception V3) on the classification of objects in blended images. Since humans can solve this task rather easily by applying object-based attention, we also augment all architectures with a multi-headed self-attention mechanism to examine its effect on performance. Lastly, we analyse whether there is a correlation between the similarity of a network architecture's structure to the human visual system and its ability to correctly classify objects in blended images. Our findings show that adding a self-attention mechanism reliably increases the similarity to the V4 area of the human ventral stream, an area where attention has a large influence on the processing of visual stimuli.
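
The abstract does not say how the blended images are constructed. One common construction, assumed here purely for illustration, is a pixel-wise weighted average of two source images, so that both objects remain partially visible in the result. The sketch below shows that idea on toy grayscale images; the function name `blend`, the `alpha` weight, and the list-of-rows image representation are hypothetical and not taken from the paper.

```python
from typing import List

# Assumption: a grayscale image is a list of rows of intensities in [0, 1].
Image = List[List[float]]


def blend(image_a: Image, image_b: Image, alpha: float = 0.5) -> Image:
    """Return the pixel-wise weighted average alpha * a + (1 - alpha) * b."""
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    same_shape = len(image_a) == len(image_b) and all(
        len(row_a) == len(row_b) for row_a, row_b in zip(image_a, image_b)
    )
    if not same_shape:
        raise ValueError("images must have identical shapes")
    return [
        [alpha * a + (1.0 - alpha) * b for a, b in zip(row_a, row_b)]
        for row_a, row_b in zip(image_a, image_b)
    ]


# Two toy 2x2 "images" standing in for photographs of two different objects.
object_a = [[1.0, 0.0], [0.0, 1.0]]
object_b = [[0.0, 0.0], [1.0, 1.0]]

blended = blend(object_a, object_b)  # equal weighting of both objects
print(blended)  # [[0.5, 0.0], [0.5, 1.0]]
```

With `alpha = 0.5` neither object dominates, so a classifier has to attend to features of one object while ignoring the other, which is the situation the abstract describes for human subjects.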
