Safe Reinforcement Learning for Autonomous Driving by Using Disturbance-Observer-Based Control Barrier Functions
Authors: Zhengyu Hou; Wenjun Liu; Alois Knoll
Journal: IEEE Transactions on Intelligent Vehicles, vol. 10, no. 6, pp. 3782-3791
Publication date: 2024-09-20
Publication type: Journal Article
DOI: 10.1109/TIV.2024.3463468
URL: https://ieeexplore.ieee.org/document/10684598/
Impact factor: 14.3; JCR: Q1 (Computer Science, Artificial Intelligence)
Citations: 0
Abstract
Recently, reinforcement learning (RL) has been increasingly used in autonomous driving (AD) navigation control systems. However, most RL-based AD navigation control systems remain in the simulation stage, and their practical application is limited by growing safety concerns: the safety of these algorithms remains uncertain when confronted with real-world disturbances and vehicle model uncertainties. To enhance the safety of RL, we propose a disturbance observer (DOB) based safe soft actor-critic (SAC) algorithm, DOB-SAC, that combines the SAC algorithm with a safety constraints filter composed of a DOB and a control barrier function (CBF). When the SAC agent's action output is unsafe, the safety constraints filter alters it. We employ a DOB to accurately estimate the difference between the nominal model of the vehicle and the actual model, i.e., the lumped disturbances. Compensating for this estimate yields a more accurate vehicle model. To ensure the safety of DOB-SAC under complex and dynamically changing environmental conditions, a further predictive safety constraint is defined based on model predictive control (MPC) ideas. The safe action is then computed by safety-critical optimal control based on the DOB-compensated vehicle model, the CBF, and the predictive safety constraints. We discuss the SAC architecture and training details, and investigate the effectiveness of CBF in modeling safety constraints. Joint simulations are conducted in scenarios with static obstacles and in intersection scenes with dynamic obstacles.
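To make the idea of a safety filter that alters an unsafe action more concrete, the following is a minimal sketch on a 1-D toy system. It is not the authors' implementation. Everything in it is an assumption made for illustration: the scalar dynamics `x_dot = u + d`, the barrier `h(x) = x_limit - x`, the observer form, all gains, and the constant `u_rl` that stands in for a SAC policy output. The predictive (MPC-style) constraint from the abstract is omitted. For a scalar input and a single constraint, the usual CBF quadratic program reduces to the `min` used below.

```python
def simulate(use_dob: bool, steps: int = 500, dt: float = 0.01) -> float:
    """Run a 1-D toy vehicle and return the largest position it reaches.

    Hypothetical setup: x_dot = u + d_true, where d_true is a lumped
    disturbance that the nominal model (x_dot = u) does not know about.
    """
    x = 0.0          # position
    x_limit = 1.0    # obstacle boundary; safe set is h(x) = x_limit - x >= 0
    d_true = 0.5     # unknown disturbance pushing toward the obstacle
    alpha = 2.0      # CBF gain in the condition h_dot + alpha * h >= 0
    gain = 10.0      # observer gain
    z = -gain * x    # observer internal state, chosen so d_hat starts at 0
    x_max = x

    for _ in range(steps):
        # Disturbance estimate d_hat = z + gain * x; its error decays
        # because d_hat_dot = gain * (d_true - d_hat).
        d_hat = z + gain * x if use_dob else 0.0

        u_rl = 1.0   # stand-in for the RL action: always drive forward

        # CBF condition with the compensated model x_dot = u + d_hat:
        #   -(u + d_hat) + alpha * h >= 0   <=>   u <= alpha * h - d_hat
        h = x_limit - x
        u = min(u_rl, alpha * h - d_hat)   # closest action meeting the bound

        # Euler step of the observer, then of the true dynamics.
        z += dt * (-gain * (z + gain * x + u))
        x += dt * (u + d_true)
        x_max = max(x_max, x)

    return x_max


if __name__ == "__main__":
    print("max position with DOB:   ", simulate(use_dob=True))
    print("max position without DOB:", simulate(use_dob=False))
```

In this toy setting the filter with the observer keeps the position below `x_limit`, while the same filter using only the nominal model (estimate fixed at zero) lets the disturbance push the vehicle past the boundary. That is the motivation the abstract gives for compensating the model before applying the CBF.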
About the journal:
The IEEE Transactions on Intelligent Vehicles (T-IV) is a premier platform for publishing peer-reviewed articles that present innovative research concepts, application results, significant theoretical findings, and application case studies in the field of intelligent vehicles. With a particular emphasis on automated vehicles within roadway environments, T-IV aims to raise awareness of pressing research and application challenges.
Our focus is on providing critical information to the intelligent vehicle community, serving as a dissemination vehicle for IEEE ITS Society members and others interested in learning about the state-of-the-art developments and progress in research and applications related to intelligent vehicles. Join us in advancing knowledge and innovation in this dynamic field.