A Safe, high-precision reinforcement learning-based optimal control of surgical continuum robots: A monotone tube boundary approach with prescribed-time control capability

IF 4.3 | Zone 2, Computer Science | JCR Q1, AUTOMATION & CONTROL SYSTEMS

Robotics and Autonomous Systems | Pub Date: 2025-03-20 | DOI: 10.1016/j.robot.2025.104992

Mohammad Jabari, Andrea Botta, Luigi Tagliavini, Carmen Visconte, Giuseppe Quaglia

Robotics and Autonomous Systems, Volume 190, Article 104992. https://www.sciencedirect.com/science/article/pii/S0921889025000788

Citations: 0

Abstract

This paper introduces a novel approach to the prescribed-time control of continuum surgical robots, focusing on four key areas: enhanced system safety, tailored transient tracking, steady-state tracking enhancement, and optimal learned control. The main contribution is the application of system state constraints on tracking error, transforming these constraints into an unconstrained problem using a monotone tube boundary. This method avoids the complexity of Model Predictive Control (MPC) and Control Barrier Functions (CBF) techniques, as well as the conservatism and fixed-boundary issues associated with the Barrier Lyapunov Function (BLF) method. By using a monotone tube boundary, the approach allows for the pre-assignment of transient characteristics for tracking error, avoiding excessive overshoot and lack of adjustability seen with the Prescribed Performance Function (PPF). The prescribed-time control philosophy enables pre-determination of settling time, enhancing precision and convergence rates essential for surgical applications. Additionally, an optimized prescribed-time control strategy using an actor-critic neural network-based Reinforcement Learning (RL) approach ensures controller optimality, reducing control effort, power consumption, and heat generation in the robot's actuators. The method adapts to dynamic environments, ensuring robust performance in various surgical scenarios. Simulation results on a two-segment continuum robot demonstrate the proposed method's advantages over state-of-the-art techniques.
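The idea of turning an error constraint into an unconstrained problem can be illustrated with a small sketch. The abstract does not give the paper's boundary function or transformation, so everything here is an assumption chosen for illustration only: a cubic boundary `tube_boundary` that shrinks monotonically from `rho0` to `rho_inf` by a user-chosen time `T`, and an `atanh`-based map `transform_error` that sends an error inside the tube to an unbounded variable. Keeping the transformed variable bounded then implies the error stays inside the tube.

```python
import math


def tube_boundary(t, rho0=1.0, rho_inf=0.05, T=2.0):
    """Hypothetical monotone boundary: shrinks from rho0 at t = 0
    to rho_inf at the prescribed time T, then stays constant."""
    if t >= T:
        return rho_inf
    s = 1.0 - t / T  # decreases from 1 to 0 on [0, T)
    return (rho0 - rho_inf) * s ** 3 + rho_inf


def transform_error(e, rho):
    """Map an error e in (-rho, rho) to an unconstrained variable z.
    The atanh map is an illustrative choice, not the paper's formula."""
    if abs(e) >= rho:
        raise ValueError("tracking error is outside the tube")
    return math.atanh(e / rho)


def inverse_transform(z, rho):
    """Recover the error from z; the result always lies in (-rho, rho)."""
    return rho * math.tanh(z)


if __name__ == "__main__":
    for t in (0.0, 0.5, 1.0, 2.0, 3.0):
        rho = tube_boundary(t)
        e = 0.5 * rho  # an error halfway to the boundary
        print(f"t={t:.1f}  rho={rho:.4f}  z={transform_error(e, rho):.4f}")
```

Because `inverse_transform` maps any finite `z` back inside the tube, a controller designed on `z` without constraints respects the bound on the error by construction, and the settling time is set in advance through `T`.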


Source journal

Robotics and Autonomous Systems (Engineering & Technology: Robotics)

CiteScore: 9.00
Self-citation rate: 7.00%
Articles published: 164
Review time: 4.5 months

Journal description: Robotics and Autonomous Systems will carry articles describing fundamental developments in the field of robotics, with special emphasis on autonomous systems. An important goal of this journal is to extend the state of the art in both symbolic and sensory based robot control and learning in the context of autonomous systems. Robotics and Autonomous Systems will carry articles on the theoretical, computational and experimental aspects of autonomous systems, or modules of such systems.