A Prompt-Guided Generative Language Model for Unifying Visual Neural Decoding Across Multiple Subjects and Tasks.

IF 6.4

International Journal of Neural Systems, Article 2550068. Pub Date: 2025-09-26. DOI: 10.1142/S0129065725500686

Wei Huang, Hengjiang Li, Fan Qin, Diwei Wu, Kaiwen Cheng, Huafu Chen


Citations: 0

Abstract

Visual neural decoding not only aids in elucidating the neural mechanisms underlying the processing of visual information but also facilitates the advancement of brain-computer interface technologies. However, most current decoding studies focus on developing separate decoding models for individual subjects and specific tasks, an approach that escalates training costs and consumes a substantial amount of computational resources. This paper introduces a Prompt-Guided Generative Visual Language Decoding Model (PG-GVLDM), which uses prompt text that includes information about subjects and tasks to decode both primary categories and detailed textual descriptions from the visual response activities of multiple individuals. In addition to visual response activities, this study also incorporates a multi-head cross-attention module and feeds the model with whole-brain response activities to capture global semantic information in the brain. Experiments on the Natural Scenes Dataset (NSD) demonstrate that PG-GVLDM attains an average category decoding accuracy of 66.6% across four subjects, reflecting strong cross-subject generalization, and achieves text decoding scores of 0.342 (METEOR), 0.450 (Sentence-Transformer), 0.283 (ROUGE-1), and 0.262 (ROUGE-L), establishing state-of-the-art performance in text decoding. Furthermore, incorporating whole-brain response activities significantly enhances decoding performance by enabling the integration of distributed neural signals into coherent global semantic representations, underscoring its methodological importance for unified neural decoding. This research not only represents a breakthrough in visual neural decoding methodologies but also provides theoretical and technical support for the development of generalized brain-computer interfaces.
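The ROUGE-1 and ROUGE-L scores reported above compare a decoded description with a reference caption. The sketch below shows one common way to compute both as F1 scores, assuming lowercase whitespace tokenization and no stemming. The abstract does not state which ROUGE implementation or preprocessing the authors used, so the function names and the example sentences here are illustrative assumptions, not the paper's evaluation code.

```python
from collections import Counter


def _f1(overlap: int, cand_len: int, ref_len: int) -> float:
    """Harmonic mean of precision and recall; 0.0 when nothing overlaps."""
    if overlap == 0 or cand_len == 0 or ref_len == 0:
        return 0.0
    precision = overlap / cand_len
    recall = overlap / ref_len
    return 2 * precision * recall / (precision + recall)


def rouge_1(candidate: str, reference: str) -> float:
    """ROUGE-1 F1: clipped unigram overlap (word order is ignored)."""
    cand = candidate.lower().split()  # assumption: whitespace tokens, no stemming
    ref = reference.lower().split()
    overlap = sum((Counter(cand) & Counter(ref)).values())
    return _f1(overlap, len(cand), len(ref))


def lcs_length(a: list, b: list) -> int:
    """Length of the longest common subsequence, via row-by-row dynamic programming."""
    prev = [0] * (len(b) + 1)
    for x in a:
        curr = [0]
        for j, y in enumerate(b, start=1):
            if x == y:
                curr.append(prev[j - 1] + 1)
            else:
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def rouge_l(candidate: str, reference: str) -> float:
    """ROUGE-L F1: based on the longest common subsequence (word order matters)."""
    cand = candidate.lower().split()
    ref = reference.lower().split()
    return _f1(lcs_length(cand, ref), len(cand), len(ref))


if __name__ == "__main__":
    # Hypothetical decoded text and reference caption, for illustration only.
    decoded = "a dog runs on the beach"
    caption = "a dog is running on the beach"
    print(f"ROUGE-1: {rouge_1(decoded, caption):.3f}")
    print(f"ROUGE-L: {rouge_l(decoded, caption):.3f}")
```

ROUGE-1 rewards shared words regardless of position, while ROUGE-L only credits words that appear in the same relative order, which is why the ROUGE-L score is usually the lower of the two, as in the figures reported above.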

