Contextual feedback in object recognition: A biologically inspired computational model and human behavioral study

Impact Factor 1.4 · JCR Q4 (Neurosciences) · CAS Tier 4 (Psychology)
Elahe Soltandoost, Karim Rajaei, Reza Ebrahimpour
{"title":"对象识别中的上下文反馈:生物学启发的计算模型和人类行为研究","authors":"Elahe Soltandoost ,&nbsp;Karim Rajaei ,&nbsp;Reza Ebrahimpour","doi":"10.1016/j.visres.2025.108679","DOIUrl":null,"url":null,"abstract":"<div><div>Scene context is known to significantly influence visual perception, enhancing object recognition particularly under challenging viewing conditions. Behavioral and neuroimaging studies suggest that high-level scene information modulates activity in object-selective brain areas through top-down mechanisms, yet the underlying mechanism of this process remains unclear. Here, we introduce a biologically inspired context-based computational model (CBM) that integrates scene context into object recognition via an explicit feedback mechanism. CBM consists of two distinct pathways: Object_CNN, which processes localized object features, and Place_CNN, which extracts global scene information to modulate object processing. We compare CBM to a standard feedforward model, AlexNet, in a multiclass object recognition task under varying levels of visual degradation and occlusion. CBM significantly outperformed a standard feedforward model (AlexNet), demonstrating the effectiveness of structured contextual feedback in resolving ambiguous or degraded visual input. However, behavioral experiments revealed that while humans also benefited from congruent context — particularly at high occlusion levels — the effect was modest. Human recognition remained relatively robust even without contextual support, suggesting that mechanisms such as global shape processing and pattern completion, likely mediated by local recurrent processes, play a dominant role in resolving occluded input. These findings highlight the potential of contextual feedback for enhancing model performance, while also underscoring key differences between human and models. Our results point toward the need for models that combine context-sensitive feedback with object-intrinsic local recurrent processes to more closely approximate the flexible and resilient strategies of human perception.</div></div>","PeriodicalId":23670,"journal":{"name":"Vision Research","volume":"237 ","pages":"Article 108679"},"PeriodicalIF":1.4000,"publicationDate":"2025-08-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"Contextual feedback in object recognition: A biologically inspired computational model and human behavioral study\",\"authors\":\"Elahe Soltandoost ,&nbsp;Karim Rajaei ,&nbsp;Reza Ebrahimpour\",\"doi\":\"10.1016/j.visres.2025.108679\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"<div><div>Scene context is known to significantly influence visual perception, enhancing object recognition particularly under challenging viewing conditions. Behavioral and neuroimaging studies suggest that high-level scene information modulates activity in object-selective brain areas through top-down mechanisms, yet the underlying mechanism of this process remains unclear. Here, we introduce a biologically inspired context-based computational model (CBM) that integrates scene context into object recognition via an explicit feedback mechanism. CBM consists of two distinct pathways: Object_CNN, which processes localized object features, and Place_CNN, which extracts global scene information to modulate object processing. We compare CBM to a standard feedforward model, AlexNet, in a multiclass object recognition task under varying levels of visual degradation and occlusion. 
CBM significantly outperformed a standard feedforward model (AlexNet), demonstrating the effectiveness of structured contextual feedback in resolving ambiguous or degraded visual input. However, behavioral experiments revealed that while humans also benefited from congruent context — particularly at high occlusion levels — the effect was modest. Human recognition remained relatively robust even without contextual support, suggesting that mechanisms such as global shape processing and pattern completion, likely mediated by local recurrent processes, play a dominant role in resolving occluded input. These findings highlight the potential of contextual feedback for enhancing model performance, while also underscoring key differences between human and models. Our results point toward the need for models that combine context-sensitive feedback with object-intrinsic local recurrent processes to more closely approximate the flexible and resilient strategies of human perception.</div></div>\",\"PeriodicalId\":23670,\"journal\":{\"name\":\"Vision Research\",\"volume\":\"237 \",\"pages\":\"Article 108679\"},\"PeriodicalIF\":1.4000,\"publicationDate\":\"2025-08-30\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Vision Research\",\"FirstCategoryId\":\"102\",\"ListUrlMain\":\"https://www.sciencedirect.com/science/article/pii/S0042698925001403\",\"RegionNum\":4,\"RegionCategory\":\"心理学\",\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"Q4\",\"JCRName\":\"NEUROSCIENCES\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Vision Research","FirstCategoryId":"102","ListUrlMain":"https://www.sciencedirect.com/science/article/pii/S0042698925001403","RegionNum":4,"RegionCategory":"心理学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q4","JCRName":"NEUROSCIENCES","Score":null,"Total":0}
Citations: 0

Abstract

Scene context is known to significantly influence visual perception, enhancing object recognition particularly under challenging viewing conditions. Behavioral and neuroimaging studies suggest that high-level scene information modulates activity in object-selective brain areas through top-down mechanisms, yet the underlying mechanism of this process remains unclear. Here, we introduce a biologically inspired context-based computational model (CBM) that integrates scene context into object recognition via an explicit feedback mechanism. CBM consists of two distinct pathways: Object_CNN, which processes localized object features, and Place_CNN, which extracts global scene information to modulate object processing. We compare CBM to a standard feedforward model, AlexNet, in a multiclass object recognition task under varying levels of visual degradation and occlusion. CBM significantly outperformed a standard feedforward model (AlexNet), demonstrating the effectiveness of structured contextual feedback in resolving ambiguous or degraded visual input. However, behavioral experiments revealed that while humans also benefited from congruent context — particularly at high occlusion levels — the effect was modest. Human recognition remained relatively robust even without contextual support, suggesting that mechanisms such as global shape processing and pattern completion, likely mediated by local recurrent processes, play a dominant role in resolving occluded input. These findings highlight the potential of contextual feedback for enhancing model performance, while also underscoring key differences between humans and models. Our results point toward the need for models that combine context-sensitive feedback with object-intrinsic local recurrent processes to more closely approximate the flexible and resilient strategies of human perception.
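The abstract describes CBM as two CNN pathways, with Place_CNN's global scene representation feeding back onto Object_CNN's local object features. The following is a minimal PyTorch sketch of how such an explicit contextual-feedback architecture could be wired; the AlexNet backbones echo the paper's baseline, but the channel-wise gating mechanism, layer sizes, and class name are illustrative assumptions, not the authors' implementation.

```python
# Minimal sketch, assuming a channel-wise gating form of contextual feedback.
# This is not the authors' code; it only illustrates the two-pathway idea.
import torch
import torch.nn as nn
import torchvision.models as models


class ContextFeedbackModel(nn.Module):
    """Two-pathway object recognizer with explicit scene-context feedback (illustrative)."""

    def __init__(self, num_object_classes: int, feedback_dim: int = 256):
        super().__init__()
        # Object pathway: localized object features (AlexNet convolutional stack).
        self.object_cnn = models.alexnet(weights=None).features
        # Place pathway: global scene information from the full image.
        self.place_cnn = models.alexnet(weights=None).features
        # Map pooled scene features to a channel-wise gain that modulates the
        # object pathway -- one plausible form of "explicit feedback".
        self.context_to_gain = nn.Sequential(
            nn.AdaptiveAvgPool2d(1),
            nn.Flatten(),
            nn.Linear(256, feedback_dim),
            nn.ReLU(),
            nn.Linear(feedback_dim, 256),
            nn.Sigmoid(),  # gains in [0, 1]
        )
        self.classifier = nn.Sequential(
            nn.AdaptiveAvgPool2d(1),
            nn.Flatten(),
            nn.Linear(256, num_object_classes),
        )

    def forward(self, object_crop: torch.Tensor, scene_image: torch.Tensor) -> torch.Tensor:
        obj_feat = self.object_cnn(object_crop)    # (B, 256, 6, 6) for 224x224 input
        scene_feat = self.place_cnn(scene_image)   # (B, 256, 6, 6)
        gain = self.context_to_gain(scene_feat)    # (B, 256) contextual modulation signal
        # Feedback step: scene context re-weights object feature channels.
        modulated = obj_feat * gain[:, :, None, None]
        return self.classifier(modulated)


# Dummy forward pass: an occluded object crop plus its surrounding scene.
model = ContextFeedbackModel(num_object_classes=10)
logits = model(torch.randn(2, 3, 224, 224), torch.randn(2, 3, 224, 224))
print(logits.shape)  # torch.Size([2, 10])
```

In the actual study the place pathway would plausibly be trained on scene data and the feedback could target earlier layers; the sketch only captures the overall two-pathway, feedback-modulated structure described in the abstract.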
Source journal
Vision Research (Medicine: Neuroscience)
CiteScore: 3.70
Self-citation rate: 16.70%
Annual articles: 111
Review time: 66 days
Aims and scope: Vision Research is a journal devoted to the functional aspects of human, vertebrate and invertebrate vision and publishes experimental and observational studies, reviews, and theoretical and computational analyses. Vision Research also publishes clinical studies relevant to normal visual function and basic research relevant to visual dysfunction or its clinical investigation. Functional aspects of vision is interpreted broadly, ranging from molecular and cellular function to perception and behavior. Detailed descriptions are encouraged but enough introductory background should be included for non-specialists. Theoretical and computational papers should give a sense of order to the facts or point to new verifiable observations. Papers dealing with questions in the history of vision science should stress the development of ideas in the field.