Metric invariance in object recognition: a review and further evidence.

E E Cooper, I Biederman, J E Hummel
{"title":"Metric invariance in object recognition: a review and further evidence.","authors":"E E Cooper,&nbsp;I Biederman,&nbsp;J E Hummel","doi":"10.1037/h0084317","DOIUrl":null,"url":null,"abstract":"<p><p>Phenomenologically, human shape recognition appears to be invariant with changes of orientation in depth (up to parts occlusion), position in the visual field, and size. Recent versions of template theories (e.g., Ullman, 1989; Lowe, 1987) assume that these invariances are achieved through the application of transformations such as rotation, translation, and scaling of the image so that it can be matched metrically to a stored template. Presumably, such transformations would require time for their execution. We describe recent priming experiments in which the effects of a prior brief presentation of an image on its subsequent recognition are assessed. The results of these experiments indicate that the invariance is complete: The magnitude of visual priming (as distinct from name or basic level concept priming) is not affected by a change in position, size, orientation in depth, or the particular lines and vertices present in the image, as long as representations of the same components can be activated. An implemented seven layer neural network model (Hummel & Biederman, 1992) that captures these fundamental properties of human object recognition is described. Given a line drawing of an object, the model activates a viewpoint-invariant structural description of the object, specifying its parts and their interrelations. Visual priming is interpreted as a change in the connection weights for the activation of: a) cells, termed geon feature assemblies (GFAs), that conjoin the output of units that represent invariant, independent properties of a single geon and its relations (such as its type, aspect ratio, relations to other geons), or b) a change in the connection weights by which several GFAs activate a cell representing an object.</p>","PeriodicalId":75671,"journal":{"name":"Canadian journal of psychology","volume":"46 2","pages":"191-214"},"PeriodicalIF":0.0000,"publicationDate":"1992-06-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://sci-hub-pdf.com/10.1037/h0084317","citationCount":"117","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Canadian journal of psychology","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1037/h0084317","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 117

Abstract

Phenomenologically, human shape recognition appears to be invariant with changes of orientation in depth (up to parts occlusion), position in the visual field, and size. Recent versions of template theories (e.g., Ullman, 1989; Lowe, 1987) assume that these invariances are achieved through the application of transformations such as rotation, translation, and scaling of the image so that it can be matched metrically to a stored template. Presumably, such transformations would require time for their execution. We describe recent priming experiments in which the effects of a prior brief presentation of an image on its subsequent recognition are assessed. The results of these experiments indicate that the invariance is complete: The magnitude of visual priming (as distinct from name or basic level concept priming) is not affected by a change in position, size, orientation in depth, or the particular lines and vertices present in the image, as long as representations of the same components can be activated. An implemented seven layer neural network model (Hummel & Biederman, 1992) that captures these fundamental properties of human object recognition is described. Given a line drawing of an object, the model activates a viewpoint-invariant structural description of the object, specifying its parts and their interrelations. Visual priming is interpreted as a change in the connection weights for the activation of: a) cells, termed geon feature assemblies (GFAs), that conjoin the output of units that represent invariant, independent properties of a single geon and its relations (such as its type, aspect ratio, relations to other geons), or b) a change in the connection weights by which several GFAs activate a cell representing an object.
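To make the priming account above concrete, the following toy script is a minimal sketch, not the actual seven-layer Hummel & Biederman (1992) network: the number of units, the learning rate, and the `prime`/`recognize` helpers are all hypothetical. It illustrates the two claims in the abstract: geon feature assemblies (GFAs) encode only invariant geon properties, so different views of an object yield the same GFA pattern; and visual priming can be modeled as a strengthening of the GFA-to-object connection weights, which facilitates a later presentation equally regardless of changes in position, size, or orientation in depth.

```python
# Illustrative sketch only (assumed names and parameters), NOT the
# Hummel & Biederman (1992) implementation described in the abstract.
import numpy as np

rng = np.random.default_rng(0)

N_GFA = 8            # hypothetical number of GFA units relevant to one object
LEARNING_RATE = 0.2  # hypothetical size of the priming-induced weight change

# Initial GFA -> object connection weights (small random values).
weights = rng.normal(0.0, 0.1, size=N_GFA)


def recognize(gfa_pattern: np.ndarray) -> float:
    """Object-unit activation as a weighted sum of GFA activations."""
    return float(weights @ gfa_pattern)


def prime(gfa_pattern: np.ndarray) -> None:
    """Model visual priming as a Hebbian-style increment of the GFA -> object
    weights for the units active during the first, brief presentation."""
    global weights
    weights = weights + LEARNING_RATE * gfa_pattern


# A second view of the same object (translated, scaled, or rotated in depth)
# is assumed to produce the SAME GFA pattern, because GFAs encode only
# invariant properties (geon type, aspect ratio, relations to other geons).
object_view_1 = np.array([1, 0, 1, 0, 1, 1, 0, 0], dtype=float)
object_view_2 = object_view_1.copy()      # same object, different view
control_object = np.array([0, 1, 0, 1, 0, 0, 1, 1], dtype=float)

before = recognize(object_view_2)
prime(object_view_1)                       # brief first presentation
after = recognize(object_view_2)           # later test with a changed view

print(f"activation before priming: {before:+.3f}")
print(f"activation after priming:  {after:+.3f}")   # facilitated despite the view change
print(f"unprimed control object:   {recognize(control_object):+.3f}")
```

In this sketch the facilitation transfers completely across the view change because the weight update and the test presentation operate on the identical GFA pattern, which is the abstract's interpretation of complete metric invariance in visual priming.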
