Hidden Brain State-Based Internal Evaluation Using Kernel Inverse Reinforcement Learning in Brain-Machine Interfaces

IF 4.8 · CAS Tier 2 (Medicine) · JCR Q2, ENGINEERING, BIOMEDICAL
Jieyuan Tan;Xiang Zhang;Shenghui Wu;Zhiwei Song;Yiwen Wang
{"title":"在脑机接口中使用核反强化学习进行基于隐藏脑状态的内部评估","authors":"Jieyuan Tan;Xiang Zhang;Shenghui Wu;Zhiwei Song;Yiwen Wang","doi":"10.1109/TNSRE.2024.3503713","DOIUrl":null,"url":null,"abstract":"Reinforcement learning (RL)-based brain machine interfaces (BMIs) assist paralyzed people in controlling neural prostheses without the need for real limb movement as supervised signals. The design of reward signal significantly impacts the learning efficiency of the RL-based decoders. Existing reward designs in the RL-based BMI framework rely on external rewards or manually labeled internal rewards, unable to accurately extract subjects’ internal evaluation. In this paper, we propose a hidden brain state-based kernel inverse reinforcement learning (HBS-KIRL) method to accurately infer the subject-specific internal evaluation from neural activity during the BMI task. The state-space model is applied to project the neural state into low-dimensional hidden brain state space, which greatly reduces the exploration dimension. Then the kernel method is applied to speed up the convergence of policy, reward, and Q-value networks in reproducing kernel Hilbert space (RKHS). We tested our proposed algorithm on the data collected from the medial prefrontal cortex (mPFC) of rats when they were performing a two-lever-discrimination task. We assessed the state-value estimation performance of our proposed method and compared it with naïve IRL and PCA-based IRL. To validate that the extracted internal evaluation could contribute to the decoder training, we compared the decoding performance of decoders trained by different reward models, including manually designed reward, naïve IRL, PCA-IRL, and our proposed HBS-KIRL. The results show that the HBS-KIRL method can give a stable and accurate estimation of state-value distribution with respect to behavior. Compared with other methods, the decoder guided by HBS-KIRL achieves consistent and better decoding performance over days. This study reveals the potential of applying the IRL method to better extract subject-specific evaluation and improve the BMI decoding performance.","PeriodicalId":13419,"journal":{"name":"IEEE Transactions on Neural Systems and Rehabilitation Engineering","volume":"32 ","pages":"4219-4229"},"PeriodicalIF":4.8000,"publicationDate":"2024-11-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=10759843","citationCount":"0","resultStr":"{\"title\":\"Hidden Brain State-Based Internal Evaluation Using Kernel Inverse Reinforcement Learning in Brain-Machine Interfaces\",\"authors\":\"Jieyuan Tan;Xiang Zhang;Shenghui Wu;Zhiwei Song;Yiwen Wang\",\"doi\":\"10.1109/TNSRE.2024.3503713\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Reinforcement learning (RL)-based brain machine interfaces (BMIs) assist paralyzed people in controlling neural prostheses without the need for real limb movement as supervised signals. The design of reward signal significantly impacts the learning efficiency of the RL-based decoders. Existing reward designs in the RL-based BMI framework rely on external rewards or manually labeled internal rewards, unable to accurately extract subjects’ internal evaluation. In this paper, we propose a hidden brain state-based kernel inverse reinforcement learning (HBS-KIRL) method to accurately infer the subject-specific internal evaluation from neural activity during the BMI task. 
The state-space model is applied to project the neural state into low-dimensional hidden brain state space, which greatly reduces the exploration dimension. Then the kernel method is applied to speed up the convergence of policy, reward, and Q-value networks in reproducing kernel Hilbert space (RKHS). We tested our proposed algorithm on the data collected from the medial prefrontal cortex (mPFC) of rats when they were performing a two-lever-discrimination task. We assessed the state-value estimation performance of our proposed method and compared it with naïve IRL and PCA-based IRL. To validate that the extracted internal evaluation could contribute to the decoder training, we compared the decoding performance of decoders trained by different reward models, including manually designed reward, naïve IRL, PCA-IRL, and our proposed HBS-KIRL. The results show that the HBS-KIRL method can give a stable and accurate estimation of state-value distribution with respect to behavior. Compared with other methods, the decoder guided by HBS-KIRL achieves consistent and better decoding performance over days. This study reveals the potential of applying the IRL method to better extract subject-specific evaluation and improve the BMI decoding performance.\",\"PeriodicalId\":13419,\"journal\":{\"name\":\"IEEE Transactions on Neural Systems and Rehabilitation Engineering\",\"volume\":\"32 \",\"pages\":\"4219-4229\"},\"PeriodicalIF\":4.8000,\"publicationDate\":\"2024-11-21\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=10759843\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"IEEE Transactions on Neural Systems and Rehabilitation Engineering\",\"FirstCategoryId\":\"5\",\"ListUrlMain\":\"https://ieeexplore.ieee.org/document/10759843/\",\"RegionNum\":2,\"RegionCategory\":\"医学\",\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"Q2\",\"JCRName\":\"ENGINEERING, BIOMEDICAL\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"IEEE Transactions on Neural Systems and Rehabilitation Engineering","FirstCategoryId":"5","ListUrlMain":"https://ieeexplore.ieee.org/document/10759843/","RegionNum":2,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q2","JCRName":"ENGINEERING, BIOMEDICAL","Score":null,"Total":0}
Citations: 0

Abstract

Reinforcement learning (RL)-based brain-machine interfaces (BMIs) assist paralyzed people in controlling neural prostheses without requiring real limb movement as supervision signals. The design of the reward signal significantly impacts the learning efficiency of RL-based decoders. Existing reward designs in the RL-based BMI framework rely on external rewards or manually labeled internal rewards, and thus cannot accurately extract subjects' internal evaluation. In this paper, we propose a hidden brain state-based kernel inverse reinforcement learning (HBS-KIRL) method to accurately infer the subject-specific internal evaluation from neural activity during a BMI task. A state-space model projects the neural state into a low-dimensional hidden brain state space, which greatly reduces the exploration dimension. The kernel method is then applied to speed up the convergence of the policy, reward, and Q-value networks in reproducing kernel Hilbert space (RKHS). We tested the proposed algorithm on data collected from the medial prefrontal cortex (mPFC) of rats performing a two-lever-discrimination task. We assessed the state-value estimation performance of our method and compared it with naïve IRL and PCA-based IRL. To validate that the extracted internal evaluation contributes to decoder training, we compared the decoding performance of decoders trained with different reward models: a manually designed reward, naïve IRL, PCA-IRL, and the proposed HBS-KIRL. The results show that HBS-KIRL gives a stable and accurate estimate of the state-value distribution with respect to behavior. Compared with the other methods, the decoder guided by HBS-KIRL achieves consistent and better decoding performance over days. This study demonstrates the potential of IRL methods to better extract subject-specific evaluation and improve BMI decoding performance.
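To make the described pipeline concrete, below is a minimal, hypothetical Python sketch of the two ingredients the abstract names: projecting neural activity into a low-dimensional hidden brain state, and representing Q-values as a kernel expansion over those states in an RKHS. This is an illustration under stated assumptions, not the authors' implementation: the SVD projection stands in for the paper's state-space model, the Gaussian kernel and temporal-difference update are standard textbook choices, and the surrogate reward rule (rewarding agreement with the observed action) is only a crude stand-in for the learned IRL reward. All function and variable names are invented for this sketch.

```python
import numpy as np

rng = np.random.default_rng(0)

def project_to_hidden_state(spikes, n_latent=3):
    """Project binned spike counts (T x n_neurons) onto a low-dimensional
    subspace via SVD; a stand-in for the paper's state-space model."""
    X = spikes - spikes.mean(axis=0)
    _, _, Vt = np.linalg.svd(X, full_matrices=False)
    return X @ Vt[:n_latent].T          # (T x n_latent) hidden brain states

def gaussian_kernel(a, b, sigma=1.0):
    """Gaussian (RBF) kernel matrix between two sets of hidden states."""
    d2 = ((a[:, None, :] - b[None, :, :]) ** 2).sum(axis=-1)
    return np.exp(-d2 / (2.0 * sigma ** 2))

class KernelQ:
    """Q-function as a kernel expansion over stored hidden states,
    with one coefficient vector per discrete action (e.g., left/right lever)."""
    def __init__(self, centers, n_actions, sigma=1.0):
        self.centers = centers                           # (M x n_latent)
        self.alpha = np.zeros((n_actions, len(centers))) # RKHS coefficients
        self.sigma = sigma

    def q_values(self, z):
        k = gaussian_kernel(np.atleast_2d(z), self.centers, self.sigma)
        return self.alpha @ k.T                          # (n_actions x batch)

    def td_update(self, z, a, r, z_next, gamma=0.9, lr=0.1):
        """One temporal-difference step toward reward r at hidden state z."""
        q = self.q_values(z)[a, 0]
        target = r + gamma * self.q_values(z_next).max()
        k = gaussian_kernel(np.atleast_2d(z), self.centers, self.sigma)[0]
        self.alpha[a] += lr * (target - q) * k

# Toy run on simulated spike counts: project to hidden states, then shape the
# reward so that the recorded action looks most valuable (an IRL-flavoured
# surrogate, greatly simplified relative to HBS-KIRL).
T, n_neurons = 200, 30
spikes = rng.poisson(2.0, size=(T, n_neurons)).astype(float)
z = project_to_hidden_state(spikes, n_latent=3)
qfun = KernelQ(centers=z[::10], n_actions=2, sigma=1.0)

actions = rng.integers(0, 2, size=T - 1)   # observed lever presses (simulated)
for t in range(T - 1):
    q = qfun.q_values(z[t])[:, 0]
    # Surrogate reward: +1 when the greedy action matches the observed one.
    r = 1.0 if q.argmax() == actions[t] else -1.0
    qfun.td_update(z[t], actions[t], r, z[t + 1])

print("state-value estimate at t=0:", qfun.q_values(z[0]).max())
```

One design point the sketch preserves: the kernel centers are a subsample of hidden states, so each Q-value evaluation costs only a small kernel expansion. This is where the low-dimensional projection pays off, since kernel methods converge faster and generalize better when the exploration space is small, which is the motivation the abstract gives for the state-space step.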
Source journal
IEEE Transactions on Neural Systems and Rehabilitation Engineering
CiteScore: 8.60
Self-citation rate: 8.20%
Articles per year: 479
Review time: 6-12 weeks
Scope: Rehabilitative and neural aspects of biomedical engineering, including functional electrical stimulation, acoustic dynamics, human performance measurement and analysis, nerve stimulation, electromyography, motor control and stimulation; and hardware and software applications for rehabilitation engineering and assistive devices.