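Brain-inspired multisensory integration neural network for cross-modal recognition through spatiotemporal dynamics and deep learning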

IF 3.1 | Zone 3 (Engineering and Technology) | JCR Q2 (Neurosciences)

Cognitive Neurodynamics | Pub Date: 2024-12-01 | Epub Date: 2023-02-02 | DOI: 10.1007/s11571-023-09932-4

Haitao Yu, Quanfa Zhao

Publication type: Journal Article

Pages: 3615-3628

PDF (PMC): https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11655826/pdf/

Publisher page: https://doi.org/10.1007/s11571-023-09932-4

Metadata source: Semantic Scholar

Citations: 0

Abstract


Brain-inspired multisensory integration neural network for cross-modal recognition through spatiotemporal dynamics and deep learning.

The integration and interaction of cross-modal senses in brain neural networks can facilitate high-level cognitive functions. In this work, we propose a bioinspired multisensory integration neural network (MINN) that integrates visual and auditory inputs to recognize information across sensory modalities. The deep learning-based model cascades parallel convolutional neural networks (CNNs), which extract intrinsic features from the visual and audio inputs, with a recurrent neural network (RNN) that handles multimodal integration and interaction. The network was trained on synthetic data generated for digital recognition tasks. The spatial and temporal features that the CNNs extracted from the visual and audio inputs were encoded in mutually orthogonal subspaces. During the integration epoch, the network state evolved along quasi-rotation-symmetric trajectories, and a structural manifold with stable attractors formed in the RNN, supporting accurate cross-modal recognition. We further evaluated the robustness of MINN with noisy inputs and with asynchronous digital inputs. The experimental results demonstrate the superior performance of MINN in flexibly integrating and accurately recognizing multisensory information with distinct sensory properties. These results provide insight into the computational principles governing multisensory integration and offer a comprehensive neural network model for brain-inspired intelligence.
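The abstract reports that the visual and audio features were encoded in orthogonal subspaces but does not say how this was measured. One common way to quantify it is through the principal angles between the two feature subspaces. The sketch below is an illustration under stated assumptions, not the authors' analysis: the feature matrices are random toy data, and the names `principal_angles`, `top_components`, `visual_feats`, and `audio_feats` are hypothetical.

```python
import numpy as np


def principal_angles(A, B):
    """Principal angles (radians) between the column spaces of A and B."""
    Qa, _ = np.linalg.qr(A)  # orthonormal basis for span(A)
    Qb, _ = np.linalg.qr(B)  # orthonormal basis for span(B)
    # Singular values of Qa^T Qb are the cosines of the principal angles.
    cosines = np.linalg.svd(Qa.T @ Qb, compute_uv=False)
    return np.arccos(np.clip(cosines, 0.0, 1.0))


def top_components(X, k):
    """Top-k principal directions of the rows of X, returned as columns."""
    Xc = X - X.mean(axis=0)
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    return Vt[:k].T


# Toy data (assumption): two modalities that occupy disjoint pairs of
# orthonormal directions in a 16-dimensional feature space.
rng = np.random.default_rng(0)
basis, _ = np.linalg.qr(rng.standard_normal((16, 4)))
visual_feats = rng.standard_normal((200, 2)) @ basis[:, :2].T  # shape (200, 16)
audio_feats = rng.standard_normal((200, 2)) @ basis[:, 2:].T   # shape (200, 16)

angles_deg = np.degrees(
    principal_angles(top_components(visual_feats, 2),
                     top_components(audio_feats, 2))
)
print(angles_deg)  # angles near 90 degrees indicate orthogonal subspaces
```

With real network activations the angles would rarely be exactly 90 degrees, so a threshold or a comparison against shuffled data would be needed to decide what counts as orthogonal.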


Source journal

Cognitive Neurodynamics (Medicine: Neuroscience)

CiteScore: 6.90

Self-citation rate: 18.90%

Articles published: 140

Review time: 12 months

About the journal: Cognitive Neurodynamics provides a forum for communication and cooperation among scientists and engineers working in cognitive neurodynamics, intelligent science, and their applications. It aims to bridge the gap between theory and application, with no preference among purely theoretical, experimental, or computational models. The emphasis is on publishing original models of cognitive neurodynamics, novel computational theories, and experimental results. Intelligent science inspired by cognitive neuroscience and neurodynamics is also very welcome. The scope covers cognitive neuroscience, neural computation based on dynamics, computer science, and intelligent science, as well as their interdisciplinary applications in the natural and engineering sciences. Papers that are accessible to non-specialist readers are encouraged.

1. There is no page limit for manuscripts submitted to Cognitive Neurodynamics. Research papers should clearly represent an important advance of especially broad interest to researchers and technologists in neuroscience, biophysics, BCI, neural computer and intelligent robotics.

2. Cognitive Neurodynamics also welcomes brief communications: short papers reporting results that are of genuinely broad interest but that, for one reason or another, do not make a sufficiently complete story to justify a full article. Brief communications should run to approximately four manuscript pages.

3. Cognitive Neurodynamics publishes review articles in which a specific field is reviewed through an exhaustive literature survey. There is no restriction on the number of pages. Review articles are usually invited, but submitted reviews will also be considered.