Human Cognitive Learning in Shared Control via Differential Game With Bounded Rationality and Incomplete Information

IEEE Transactions on Artificial Intelligence. Pub Date: 2024-06-18. DOI: 10.1109/TAI.2024.3415549

Huai-Ning Wu; Xiao-Yan Jiang; Mi Wang

IEEE Transactions on Artificial Intelligence, vol. 5, no. 10, pp. 5141-5152. Journal article. https://ieeexplore.ieee.org/document/10562193/


Abstract

Because humans have limited reasoning ability and machines usually do not know human intentions, learning human cognitive levels in shared control to enhance machine intelligence is a challenging problem. This study addresses the problem in the context of human–machine shared control for a class of human-in-the-loop (HiTL) systems, using a differential game with bounded rationality and incomplete information. First, we formulate the human–machine shared control problem as a two-player nonzero-sum linear quadratic dynamic game (LQDG), in which the weighting matrix of the cost function representing the human intention is unknown to the machine. To model human bounded rationality, the level-$\boldsymbol{k}$ (LK) approach is employed to construct the LK control policies of the two players, each performing the corresponding number of steps of strategic thinking. To infer the human intention, an online adaptive inverse optimal control (IOC) algorithm is then developed from system state data, so that the control policies of the different cognitive levels can be computed. In addition, a reinforcement learning method is proposed for the machine to identify the distribution of human cognitive levels while providing proactive collaborative control that assists the human through probabilistic switching. Finally, simulation results on a cooperative shared control driver assistance system (DAS) illustrate the efficacy of the proposed approach.
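The level-k construction described in the abstract can be illustrated with a minimal sketch. This is not the authors' algorithm: it assumes a scalar system dx/dt = a*x + b_h*u_h + b_m*u_m, quadratic costs q*x^2 + r*u^2 for each player, a level-0 policy of zero input, and known human weights (the paper instead infers them with IOC). The function names and parameters are hypothetical. Each level-k gain is the LQR best response to the other player's level-(k-1) feedback gain.

```python
import math


def best_response_gain(a, b_self, b_other, k_other, q, r):
    """Scalar LQR gain for one player when the other plays u = -k_other * x."""
    # Dynamics seen by the responding player once the other's feedback is fixed.
    a_cl = a - b_other * k_other
    # Positive root of the scalar Riccati equation
    #   2*a_cl*p - (b_self**2 / r) * p**2 + q = 0
    p = r * (a_cl + math.sqrt(a_cl**2 + q * b_self**2 / r)) / b_self**2
    # Optimal feedback u = -(b_self * p / r) * x
    return b_self * p / r


def level_k_gains(a, b_h, b_m, q_h, q_m, r_h, r_m, max_level):
    """Return lists of human and machine feedback gains for levels 0..max_level."""
    # Assumption: a level-0 player applies no control (gain 0).
    human, machine = [0.0], [0.0]
    for k in range(1, max_level + 1):
        # Level-k human best-responds to the level-(k-1) machine, and vice versa.
        human.append(best_response_gain(a, b_h, b_m, machine[k - 1], q_h, r_h))
        machine.append(best_response_gain(a, b_m, b_h, human[k - 1], q_m, r_m))
    return human, machine


if __name__ == "__main__":
    h, m = level_k_gains(a=1.0, b_h=1.0, b_m=1.0,
                         q_h=1.0, q_m=2.0, r_h=1.0, r_m=1.0, max_level=3)
    for level, (kh, km) in enumerate(zip(h, m)):
        print(f"level {level}: human gain {kh:.4f}, machine gain {km:.4f}")
```

In the matrix case, the same recursion would replace the closed-form scalar root with a numerical solution of the algebraic Riccati equation for each best response.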


Source journal

IEEE Transactions on Artificial Intelligence

CiteScore

7.70

Self-citation rate

0.00%

Articles published