Adaptive Safety-Certified Reinforcement Learning for Constrained Optimal Control of Autonomous Robots With Uncertainties
Authors: Fei Zhang; Guang-Hong Yang
Journal: IEEE Internet of Things Journal, vol. 12, no. 13, pp. 23154-23168
Publication date: 2025-04-02
Publication type: Journal Article
DOI: 10.1109/JIOT.2025.3554521
URL: https://ieeexplore.ieee.org/document/10947350/
Impact factor: 8.9
JCR: Q1 (Computer Science, Information Systems)
Open access: No
Citations: 0
Abstract
This article investigates a constrained optimal control problem for safety-critical robots with parametric uncertainties. A novel adaptive safety-certified reinforcement learning (RL) algorithm is proposed, leveraging control barrier functions (CBFs) to enable safe learning of the optimal policy during the online exploration phase. Specifically, a high-order robust adaptive CBF is presented that minimally adjusts RL-derived control actions and incorporates a prescribed-time adaptation law to handle the unknown system parameters. This approach directly enforces forward invariance and allows the shrunken safe set to approach the standard safe set within a user-prescribed time. Moreover, a novel adaptive critic learning framework is presented that introduces filtered auxiliary signals integrating both instantaneous and historical data, which relaxes the strict persistent excitation (PE) condition required by existing RL methods to a weaker, easily verifiable finite excitation (FE) condition. In addition, a prescribed-time learning rule is developed to accelerate the convergence of the weights. The key advantage of the proposed approach is the decoupling of safety and RL convergence, which enables each component to be managed separately and thereby offers stronger safety certification than existing RL schemes, even under uncertain dynamics. The effectiveness and superiority of the proposed scheme are demonstrated via simulations of surveillance and regulation tasks for autonomous robots.
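The idea of a CBF "minimally adjusting" an RL-derived action can be illustrated with a much simpler setting than the one the abstract describes. The sketch below is not the paper's high-order robust adaptive CBF and has no prescribed-time adaptation law. It assumes a fully known single-integrator robot (x_dot = u), one circular obstacle, and a first-order barrier h(x) = ||x - c||^2 - r^2. The function name `cbf_filter` and all parameters are hypothetical and chosen for illustration only. With a single affine constraint, the usual CBF quadratic program has a closed-form solution, so no solver is needed.

```python
from typing import List, Sequence


def dot(a: Sequence[float], b: Sequence[float]) -> float:
    """Euclidean inner product of two equal-length vectors."""
    return sum(p * q for p, q in zip(a, b))


def cbf_filter(
    x: Sequence[float],
    u_nom: Sequence[float],
    center: Sequence[float],
    radius: float,
    alpha: float = 1.0,
) -> List[float]:
    """Return the control closest to u_nom that satisfies the CBF condition.

    Assumed model (not from the paper): x_dot = u with known dynamics.
    Barrier: h(x) = ||x - center||^2 - radius^2, safe when h(x) >= 0.
    CBF condition: grad_h(x) . u + alpha * h(x) >= 0.
    """
    diff = [xi - ci for xi, ci in zip(x, center)]
    h = dot(diff, diff) - radius ** 2
    grad_h = [2.0 * d for d in diff]

    # Value of the constraint at the nominal (e.g. RL-derived) action.
    slack = dot(grad_h, u_nom) + alpha * h
    grad_sq = dot(grad_h, grad_h)

    # Nominal action already satisfies the condition, or the constraint
    # does not depend on u (x exactly at the obstacle center): no change.
    if slack >= 0.0 or grad_sq == 0.0:
        return list(u_nom)

    # Otherwise project u_nom onto the constraint boundary. This is the
    # closed-form minimizer of ||u - u_nom||^2 under one affine constraint.
    scale = slack / grad_sq
    return [ui - scale * gi for ui, gi in zip(u_nom, grad_h)]


if __name__ == "__main__":
    # Robot at (2, 0) heading fast toward a unit obstacle at the origin.
    print(cbf_filter([2.0, 0.0], [-5.0, 0.0], [0.0, 0.0], 1.0))
    # Tangential motion is already safe and is passed through unchanged.
    print(cbf_filter([2.0, 0.0], [0.0, 1.0], [0.0, 0.0], 1.0))
```

In this toy case the filter only slows the approach toward the obstacle and leaves safe actions untouched, which is the sense in which the correction is minimal. Handling unknown parameters, higher relative degree, and prescribed-time convergence, as the abstract describes, would require the machinery developed in the article itself.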
About the journal:
The IEEE Internet of Things (IoT) Journal publishes articles and review articles covering various aspects of IoT, including IoT system architecture, IoT enabling technologies, IoT communication and networking protocols such as network coding, and IoT services and applications. Topics encompass IoT's impacts on sensor technologies, big data management, and future Internet design for applications like smart cities and smart homes. Fields of interest include IoT architectures such as things-centric, data-centric, and service-oriented IoT architecture; IoT enabling technologies and systematic integration such as sensor technologies, big sensor data management, and future Internet design for IoT; IoT services, applications, and test-beds such as IoT service middleware, IoT application programming interface (API), IoT application design, and IoT trials/experiments; and IoT standardization activities and technology development in different standards development organizations (SDOs) such as IEEE, IETF, ITU, 3GPP, and ETSI.