{"title":"NeuroFusionNet: cross-modal modeling from brain activity to visual understanding.","authors":"Kehan Lang, Jianwei Fang, Guangyao Su","doi":"10.3389/fncom.2025.1545971","DOIUrl":null,"url":null,"abstract":"<p><p>In recent years, the integration of machine vision and neuroscience has provided a new perspective for deeply understanding visual information. This paper proposes an innovative deep learning model, NeuroFusionNet, designed to enhance the understanding of visual information by integrating fMRI signals with image features. Specifically, images are processed by a visual model to extract region-of-interest (ROI) features and contextual information, which are then encoded through fully connected layers. The fMRI signals are passed through 1D convolutional layers to extract features, effectively preserving spatial information and improving computational efficiency. Subsequently, the fMRI features are embedded into a 3D voxel representation to capture the brain's activity patterns in both spatial and temporal dimensions. To accurately model the brain's response to visual stimuli, this paper introduces a Mutli-scale fMRI Timeformer module, which processes fMRI signals at different scales to extract both fine details and global responses. To further optimize the model's performance, we introduce a novel loss function called the fMRI-guided loss. 
Experimental results show that NeuroFusionNet effectively integrates image and brain activity information, providing more precise and richer visual representations for machine vision systems, with broad potential applications.</p>","PeriodicalId":12363,"journal":{"name":"Frontiers in Computational Neuroscience","volume":"19 ","pages":"1545971"},"PeriodicalIF":2.1000,"publicationDate":"2025-03-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11978827/pdf/","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Frontiers in Computational Neuroscience","FirstCategoryId":"3","ListUrlMain":"https://doi.org/10.3389/fncom.2025.1545971","RegionNum":4,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"2025/1/1 0:00:00","PubModel":"eCollection","JCR":"Q2","JCRName":"MATHEMATICAL & COMPUTATIONAL BIOLOGY","Score":null,"Total":0}
Citations: 0
Abstract
In recent years, the integration of machine vision and neuroscience has provided a new perspective for deeply understanding visual information. This paper proposes an innovative deep learning model, NeuroFusionNet, designed to enhance the understanding of visual information by integrating fMRI signals with image features. Specifically, images are processed by a visual model to extract region-of-interest (ROI) features and contextual information, which are then encoded through fully connected layers. The fMRI signals are passed through 1D convolutional layers to extract features, effectively preserving spatial information and improving computational efficiency. Subsequently, the fMRI features are embedded into a 3D voxel representation to capture the brain's activity patterns in both spatial and temporal dimensions. To accurately model the brain's response to visual stimuli, this paper introduces a Multi-scale fMRI Timeformer module, which processes fMRI signals at different scales to extract both fine details and global responses. To further optimize the model's performance, we introduce a novel loss function called the fMRI-guided loss. Experimental results show that NeuroFusionNet effectively integrates image and brain activity information, providing more precise and richer visual representations for machine vision systems, with broad potential applications.
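The two core ideas in the abstract — multi-scale temporal feature extraction from an fMRI time series, followed by late fusion with image features — can be sketched in a few lines. This is a minimal NumPy illustration under stated assumptions, not the authors' implementation: the kernel sizes, the averaging kernels (stand-ins for learned 1D convolution weights), the global-average pooling, and the `fuse` projection are all hypothetical simplifications of the paper's 1D convolutional layers, Multi-scale fMRI Timeformer, and fully connected encoders.

```python
import numpy as np

def multiscale_fmri_features(signal, scales=(3, 9, 27)):
    """Extract one feature per temporal scale from a 1D fMRI time series.

    Small kernels respond to fine temporal detail, large kernels to
    slower, more global responses -- mirroring the multi-scale idea,
    with simple averaging kernels standing in for learned filters.
    """
    feats = []
    for k in scales:
        kernel = np.ones(k) / k                      # hypothetical smoothing filter
        resp = np.convolve(signal, kernel, mode="valid")  # 1D "convolution" layer
        feats.append(resp.mean())                    # global average pool per scale
    return np.array(feats)

def fuse(image_feat, fmri_feat, w):
    """Late fusion: concatenate the two modalities, then project linearly
    (a stand-in for the fully connected fusion layers)."""
    joint = np.concatenate([image_feat, fmri_feat])
    return w @ joint

# Toy usage with random stand-in data.
rng = np.random.default_rng(0)
fmri = rng.standard_normal(120)   # one voxel's time series (e.g., 120 TRs)
img = rng.standard_normal(4)      # stand-in ROI feature vector
f = multiscale_fmri_features(fmri)
W = rng.standard_normal((2, img.size + f.size))
z = fuse(img, f, W)
print(f.shape, z.shape)           # (3,) (2,)
```

The design point the sketch makes concrete: each scale is pooled to a fixed-size summary before fusion, so the joint representation has a constant dimensionality regardless of the length of the fMRI recording.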
Journal description:
Frontiers in Computational Neuroscience is a first-tier electronic journal devoted to promoting theoretical modeling of brain function and fostering interdisciplinary interactions between theoretical and experimental neuroscience. Progress in understanding the amazing capabilities of the brain is still limited, and we believe that it will only come with deep theoretical thinking and mutually stimulating cooperation between different disciplines and approaches. We therefore invite original contributions on a wide range of topics that present the fruits of such cooperation, or provide stimuli for future alliances. We aim to provide an interactive forum for cutting-edge theoretical studies of the nervous system, and for promulgating the best theoretical research to the broader neuroscience community. Models of all styles and at all levels are welcome, from biophysically motivated realistic simulations of neurons and synapses to high-level abstract models of inference and decision making. While the journal is primarily focused on theoretically based and driven research, we welcome experimental studies that validate and test theoretical conclusions.