{"title":"A Prompt-Guided Generative Language Model for Unifying Visual Neural Decoding Across Multiple Subjects and Tasks.","authors":"Wei Huang, Hengjiang Li, Fan Qin, Diwei Wu, Kaiwen Cheng, Huafu Chen","doi":"10.1142/S0129065725500686","DOIUrl":null,"url":null,"abstract":"<p><p>Visual neural decoding not only aids in elucidating the neural mechanisms underlying the processing of visual information but also facilitates the advancement of brain-computer interface technologies. However, most current decoding studies focus on developing separate decoding models for individual subjects and specific tasks, an approach that escalates training costs and consumes a substantial amount of computational resources. This paper introduces a Prompt-Guided Generative Visual Language Decoding Model (PG-GVLDM), which uses prompt text that includes information about subjects and tasks to decode both primary categories and detailed textual descriptions from the visual response activities of multiple individuals. In addition to visual response activities, this study also incorporates a multi-head cross-attention module and feeds the model with whole-brain response activities to capture global semantic information in the brain. Experiments on the Natural Scenes Dataset (NSD) demonstrate that PG-GVLDM attains an average category decoding accuracy of 66.6% across four subjects, reflecting strong cross-subject generalization, and achieves text decoding scores of 0.342 (METEOR), 0.450 (Sentence-Transformer), 0.283 (ROUGE-1), and 0.262 (ROUGE-L), establishing state-of-the-art performance in text decoding. Furthermore, incorporating whole-brain response activities significantly enhances decoding performance by enabling the integration of distributed neural signals into coherent global semantic representations, underscoring its methodological importance for unified neural decoding. This research not only represents a breakthrough in visual neural decoding methodologies but also provides theoretical and technical support for the development of generalized brain-computer interfaces.</p>","PeriodicalId":94052,"journal":{"name":"International journal of neural systems","volume":" ","pages":"2550068"},"PeriodicalIF":6.4000,"publicationDate":"2025-09-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"International journal of neural systems","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1142/S0129065725500686","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 0
Abstract
Visual neural decoding not only aids in elucidating the neural mechanisms underlying the processing of visual information but also facilitates the advancement of brain-computer interface technologies. However, most current decoding studies focus on developing separate decoding models for individual subjects and specific tasks, an approach that escalates training costs and consumes a substantial amount of computational resources. This paper introduces a Prompt-Guided Generative Visual Language Decoding Model (PG-GVLDM), which uses prompt text that includes information about subjects and tasks to decode both primary categories and detailed textual descriptions from the visual response activities of multiple individuals. In addition to visual response activities, this study also incorporates a multi-head cross-attention module and feeds the model with whole-brain response activities to capture global semantic information in the brain. Experiments on the Natural Scenes Dataset (NSD) demonstrate that PG-GVLDM attains an average category decoding accuracy of 66.6% across four subjects, reflecting strong cross-subject generalization, and achieves text decoding scores of 0.342 (METEOR), 0.450 (Sentence-Transformer), 0.283 (ROUGE-1), and 0.262 (ROUGE-L), establishing state-of-the-art performance in text decoding. Furthermore, incorporating whole-brain response activities significantly enhances decoding performance by enabling the integration of distributed neural signals into coherent global semantic representations, underscoring its methodological importance for unified neural decoding. This research not only represents a breakthrough in visual neural decoding methodologies but also provides theoretical and technical support for the development of generalized brain-computer interfaces.