Mouse vs. AI: A Neuroethological Benchmark for Visual Robustness and Neural Alignment.

ArXiv Pub Date: 2025-09-17

Marius Schneider, Joe Canzano, Jing Peng, Yuchen Hou, Spencer LaVere Smith, Michael Beyeler



Abstract

Visual robustness under real-world conditions remains a critical bottleneck for modern reinforcement learning agents. In contrast, biological systems such as mice show remarkable resilience to environmental changes, maintaining stable performance even under degraded visual input with minimal exposure. Inspired by this gap, we propose the Mouse vs. AI: Robust Foraging Competition, a novel bioinspired visual robustness benchmark to test generalization in reinforcement learning (RL) agents trained to navigate a virtual environment toward a visually cued target. Participants train agents to perform a visually guided foraging task in a naturalistic 3D Unity environment and are evaluated on their ability to generalize to unseen, ecologically realistic visual perturbations. What sets this challenge apart is its biological grounding: real mice performed the same task, and participants receive both behavioral performance data and large-scale neural recordings (19,000+ neurons across visual cortex) for benchmarking. The competition features two tracks: (1) Visual Robustness, assessing generalization across held-out visual perturbations; and (2) Neural Alignment, evaluating how well agents' internal representations predict mouse visual cortical activity via a linear readout. We provide the full Unity environment, a fog-perturbed training condition for validation, baseline proximal policy optimization (PPO) agents, and a rich multimodal dataset. By bridging reinforcement learning, computer vision, and neuroscience through a shared, behaviorally grounded task, this challenge advances the development of robust, generalizable, and biologically inspired AI.
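The Neural Alignment track is described only as scoring how well an agent's internal representations predict mouse visual cortical activity "via a linear readout". The sketch below illustrates that general idea and is not the competition's actual scoring code: it assumes a ridge-regularized linear map fitted on a random train split and scored by mean held-out Pearson correlation across neurons. The function name `ridge_readout_score`, the regularization strength `alpha`, the split fraction, and the choice of metric are all hypothetical.

```python
import numpy as np


def ridge_readout_score(features, responses, alpha=1.0, train_frac=0.8, seed=0):
    """Fit a ridge linear readout from agent features to neural responses.

    features:  (n_samples, n_units) array of agent activations (hypothetical input).
    responses: (n_samples, n_neurons) array of recorded activity (hypothetical input).
    Returns the mean held-out Pearson correlation across neurons.
    NOTE: the metric and the split are assumptions, not the benchmark's protocol.
    """
    features = np.asarray(features, dtype=float)
    responses = np.asarray(responses, dtype=float)

    # Random train/test split over samples.
    rng = np.random.default_rng(seed)
    order = rng.permutation(features.shape[0])
    n_train = int(train_frac * features.shape[0])
    train, test = order[:n_train], order[n_train:]

    # Standardize features with training statistics only, to avoid test leakage.
    mu_x = features[train].mean(axis=0)
    sd_x = features[train].std(axis=0) + 1e-8
    x_train = (features[train] - mu_x) / sd_x
    x_test = (features[test] - mu_x) / sd_x

    # Center responses so the readout does not need an explicit bias term.
    mu_y = responses[train].mean(axis=0)
    y_train = responses[train] - mu_y

    # Closed-form ridge solution: W = (X^T X + alpha * I)^(-1) X^T Y.
    n_units = x_train.shape[1]
    gram = x_train.T @ x_train + alpha * np.eye(n_units)
    weights = np.linalg.solve(gram, x_train.T @ y_train)
    y_pred = x_test @ weights + mu_y

    # Pearson correlation per neuron on the held-out samples.
    y_true = responses[test]
    pred_c = y_pred - y_pred.mean(axis=0)
    true_c = y_true - y_true.mean(axis=0)
    denom = np.sqrt((pred_c ** 2).sum(axis=0) * (true_c ** 2).sum(axis=0)) + 1e-12
    return float(((pred_c * true_c).sum(axis=0) / denom).mean())


if __name__ == "__main__":
    # Synthetic stand-in data: responses are a noisy linear function of the features.
    rng = np.random.default_rng(1)
    feats = rng.normal(size=(500, 20))
    resp = feats @ rng.normal(size=(20, 5)) + 0.1 * rng.normal(size=(500, 5))
    print("held-out mean correlation:", ridge_readout_score(feats, resp))
```

Restricting the readout to a linear map keeps the comparison about the agent's representation rather than the fitting procedure: a more flexible decoder could recover neural activity from almost any sufficiently rich feature set.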



