Receding-Horizon Reinforcement Learning for Time-Delayed Human–Machine Shared Control of Intelligent Vehicles

IF 4.4; CAS Zone 3 (Computer Science); JCR Q2 (Computer Science, Artificial Intelligence)

IEEE Transactions on Human-Machine Systems Pub Date : 2025-01-16 DOI:10.1109/THMS.2024.3496899

Xinxin Yao; Jiahang Liu; Xinglong Zhang; Xin Xu

IEEE Transactions on Human-Machine Systems, vol. 55, no. 2, pp. 155-164. Not open access. https://ieeexplore.ieee.org/document/10844015/

Citations: 0

Abstract

Human–machine shared control has recently been regarded as a promising paradigm to improve safety and performance in complex driving scenarios. One crucial task in shared control is dynamically optimizing the driving weights between the driver and the intelligent vehicle to adapt to changing driving scenarios. However, designing an optimal human–machine shared controller with guaranteed performance and stability is challenging because of non-negligible time delays caused by communication protocols and uncertainties in driver behavior. This article proposes a novel receding-horizon reinforcement learning approach for time-delayed human–machine shared control of intelligent vehicles. First, we build a multikernel-based data-driven model of vehicle dynamics and driving behavior that accounts for time delays and uncertainties in drivers' actions. Second, a model-based receding-horizon actor–critic learning algorithm is presented to learn an explicit policy for time-delayed human–machine shared control online. Unlike classic reinforcement learning, policy learning in the proposed approach follows a receding-horizon strategy to enhance learning efficiency and adaptability. The closed-loop stability under time delays is analyzed theoretically. Hardware-in-the-loop experiments on the time-delayed human–machine shared control of intelligent vehicles have been conducted in variable-curvature road scenarios. The results demonstrate that our approach significantly improves driving performance and driver workload compared with pure manual driving and previous shared control methods.
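
The following is a minimal sketch of the general idea of receding-horizon selection of a driving weight under input delay. It is not the authors' algorithm: the multikernel data-driven model is replaced by a hand-written scalar lateral-error model, and the actor–critic learner is replaced by a plain grid search over candidate weights. All names and constants (`A`, `B`, `DELAY`, `HORIZON`, `driver_cmd`, `machine_cmd`, the cost terms) are hypothetical and chosen only for illustration.

```python
from collections import deque

# Hypothetical scalar model: e[k+1] = A*e[k] + B*u[k-DELAY]
A, B = 1.0, 0.1
DELAY = 2                                # assumed input delay, in steps
HORIZON = 10                             # prediction horizon, in steps
WEIGHTS = [i / 10 for i in range(11)]    # candidate machine weights in [0, 1]

def driver_cmd(e):
    """Hypothetical driver: weak proportional correction."""
    return -0.5 * e

def machine_cmd(e):
    """Hypothetical automation: stronger proportional correction."""
    return -4.0 * e

def blend(lam, e):
    """Shared command: lam is the machine's share, 1 - lam the driver's."""
    return (1.0 - lam) * driver_cmd(e) + lam * machine_cmd(e)

def step(e, pending, u):
    """Queue u and apply the command issued DELAY steps earlier."""
    pending = deque(pending)             # copy, so callers keep their buffer
    pending.append(u)
    applied = pending.popleft()
    return A * e + B * applied, pending

def predicted_cost(e, pending, lam):
    """Roll the delayed model forward HORIZON steps with a fixed weight."""
    cost = 0.0
    for _ in range(HORIZON):
        u = blend(lam, e)
        e, pending = step(e, pending, u)
        cost += e * e + 0.01 * u * u     # tracking error plus control effort
    return cost

def choose_weight(e, pending):
    """Pick the candidate weight with the lowest predicted cost."""
    return min(WEIGHTS, key=lambda lam: predicted_cost(e, pending, lam))

def simulate(e0=1.0, steps=60):
    """Re-plan at every step and apply only the first blended command."""
    e, pending = e0, deque([0.0] * DELAY)
    history = []
    for _ in range(steps):
        lam = choose_weight(e, pending)
        e, pending = step(e, pending, blend(lam, e))
        history.append((lam, e))
    return history
```

Calling `simulate()` returns a list of `(weight, error)` pairs, one per step. The weight is re-optimized at every step from the current state and the buffer of commands still in transit, which is the receding-horizon part of the sketch. A learned policy, as described in the abstract, would replace the grid search in `choose_weight`.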

Source journal

IEEE Transactions on Human-Machine Systems (Computer Science, Artificial Intelligence; Computer Science, Cybernetics)

CiteScore: 7.10

Self-citation rate: 11.10%

Articles published: 136

Journal description: The scope of the IEEE Transactions on Human-Machine Systems includes the fields of human-machine systems. It covers human systems and human organizational interactions, including cognitive ergonomics, system test and evaluation, and human information processing concerns in systems and organizations.