An action decoding framework combined with deep neural network for predicting the semantics of human actions in videos from evoked brain activities.

IF 2.5 · CAS Zone 4 (Medicine) · JCR Q2 · Mathematical & Computational Biology

Frontiers in Neuroinformatics · Pub Date: 2025-02-19 · eCollection Date: 2025-01-01 · DOI: 10.3389/fninf.2025.1526259

Yuanyuan Zhang, Manli Tian, Baolin Liu

Volume 19, article 1526259. Publication type: Journal Article. Full-text PDF (PMC): https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11880012/pdf/

Citations: 0

Abstract

Introduction: Recently, numerous studies have focused on the semantic decoding of perceived images based on functional magnetic resonance imaging (fMRI) activities. However, it remains unclear whether it is possible to establish relationships between brain activities and semantic features of human actions in video stimuli. Here we construct a framework for decoding action semantics by establishing relationships between brain activities and semantic features of human actions.

Methods: To make effective use of the small amount of available brain activity data, our proposed method employs a pre-trained image action recognition network model based on an expanding three-dimensional (X3D) deep neural network (DNN) framework. To apply brain activities to the image action recognition network, we train regression models that learn the relationship between brain activities and deep-layer image features. To improve decoding accuracy, we make three additions: we add a nonlocal-attention module to the X3D model to capture long-range temporal and spatial dependencies; we propose a multilayer perceptron (MLP) module constrained by a multi-task loss to build a more accurate regression mapping; and we perform data augmentation through linear interpolation to expand the amount of data and reduce the impact of the small sample size.
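As a rough illustration of two of the ideas above (expanding a small dataset by linear interpolation, then regressing from brain activity to deep-layer features), the following is a minimal sketch and not the authors' implementation. The abstract does not say which samples are interpolated or how the MLP and multi-task loss are configured, so several choices here are assumptions: interpolation is applied to randomly chosen pairs of samples, a plain ridge-regularized linear map fitted by gradient descent stands in for the MLP module, and all names (`augment`, `fit_ridge_gd`, `W_true`, the toy dimensions) are hypothetical.

```python
import random

def lerp(a, b, lam):
    """Linear interpolation between two equal-length vectors."""
    return [lam * x + (1.0 - lam) * y for x, y in zip(a, b)]

def augment(brain, feats, n_new, seed=0):
    """Append n_new synthetic (brain, feature) pairs, each made by mixing
    two randomly chosen real pairs with the same weight (an assumption)."""
    rng = random.Random(seed)
    brain_aug, feats_aug = list(brain), list(feats)
    for _ in range(n_new):
        i, j = rng.randrange(len(brain)), rng.randrange(len(brain))
        lam = rng.random()
        brain_aug.append(lerp(brain[i], brain[j], lam))
        feats_aug.append(lerp(feats[i], feats[j], lam))
    return brain_aug, feats_aug

def predict(W, x):
    """Linear map: feature k is the dot product of row k of W with x."""
    return [sum(w * xv for w, xv in zip(row, x)) for row in W]

def mse(W, X, Y):
    """Mean over samples of the squared prediction error."""
    total = 0.0
    for x, y in zip(X, Y):
        total += sum((p - t) ** 2 for p, t in zip(predict(W, x), y))
    return total / len(X)

def fit_ridge_gd(X, Y, alpha=0.01, lr=0.05, epochs=500):
    """Gradient descent on mse(W) + alpha * ||W||^2 (stand-in for the MLP)."""
    n, d, k = len(X), len(X[0]), len(Y[0])
    W = [[0.0] * d for _ in range(k)]
    for _ in range(epochs):
        # Start each gradient with the ridge penalty term.
        grad = [[2.0 * alpha * W[a][b] for b in range(d)] for a in range(k)]
        for x, y in zip(X, Y):
            err = [p - t for p, t in zip(predict(W, x), y)]
            for a in range(k):
                for b in range(d):
                    grad[a][b] += 2.0 * err[a] * x[b] / n
        for a in range(k):
            for b in range(d):
                W[a][b] -= lr * grad[a][b]
    return W

# Toy data: 8 "scans" of 3 voxels mapped to 2 deep-layer features.
rng = random.Random(42)
W_true = [[1.0, -2.0, 0.5], [0.0, 1.5, -1.0]]  # hypothetical ground truth
brain = [[rng.gauss(0.0, 1.0) for _ in range(3)] for _ in range(8)]
feats = [predict(W_true, x) for x in brain]

brain_aug, feats_aug = augment(brain, feats, n_new=16)
W = fit_ridge_gd(brain_aug, feats_aug)
print(len(brain_aug), round(mse(W, brain, feats), 4))
```

Using the same mixing weight for the brain vector and the feature vector keeps each synthetic pair consistent with any linear brain-to-feature relationship. A nonlinear regressor such as the MLP described above would not preserve that property exactly.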

Results and discussion: Our findings indicate that the features in the X3D-DNN are biologically relevant and capture information useful for perception. The proposed method enriches the semantic decoding model. We also conducted several experiments with data from different subsets of brain regions known to process visual stimuli. The results suggest that semantic information for human actions is widespread across the entire visual cortex.



Source journal: Frontiers in Neuroinformatics (Mathematical & Computational Biology; Neurosciences)

CiteScore: 4.80

Self-citation rate: 5.70%

Articles published: 132

Review time: 14 weeks

About the journal: Frontiers in Neuroinformatics publishes rigorously peer-reviewed research on the development and implementation of numerical/computational models and analytical tools used to share, integrate and analyze experimental data and advance theories of nervous system function. Specialty Chief Editors Jan G. Bjaalie at the University of Oslo and Sean L. Hill at the École Polytechnique Fédérale de Lausanne are supported by an outstanding Editorial Board of international experts. This multidisciplinary open-access journal is at the forefront of disseminating and communicating scientific knowledge and impactful discoveries to researchers, academics and the public worldwide. Neuroscience is being propelled into the information age as the volume of information explodes, demanding organization and synthesis. Novel synthesis approaches are opening up a new dimension for the exploration of the components of brain elements and systems and the vast number of variables that underlie their functions. Neural data is highly heterogeneous with complex inter-relations across multiple levels, driving the need for innovative organizing and synthesizing approaches from genes to cognition, and covering a range of species and disease states. Frontiers in Neuroinformatics therefore welcomes submissions on existing neuroscience databases, development of data and knowledge bases for all levels of neuroscience, applications and technologies that can facilitate data sharing (interoperability, formats, terminologies, and ontologies), and novel tools for data acquisition, analyses, visualization, and dissemination of nervous system data. Our journal welcomes submissions on new tools (software and hardware) that support brain modeling, and the merging of neuroscience databases with brain models used for simulation and visualization.