{"title":"约束流形上的安全强化学习:理论与应用","authors":"Puze Liu;Haitham Bou-Ammar;Jan Peters;Davide Tateo","doi":"10.1109/TRO.2025.3567477","DOIUrl":null,"url":null,"abstract":"Integrating learning-based techniques, especially reinforcement learning, into robotics is promising for solving complex problems in unstructured environments. Most of the existing approaches rely on training in carefully calibrated simulators before being deployed on real robots, often without real-world fine-tuning. While effective in controlled settings, this framework falls short in applications where precise simulation is unavailable or the environment is too complex to model. Instead, on-robot learning, which learns by interacting directly with the real world, offers a promising alternative. One major problem for on-robot reinforcement learning is ensuring safety, as uncontrolled exploration can cause catastrophic damage to the robot or the environment. Indeed, safety specifications, often represented as constraints, can be complex and nonlinear, making safety challenging to guarantee in learning systems. In this article, we show how we can impose complex safety constraints on learning-based robotics systems in a principled manner, both from theoretical and practical points of view. Our approach is based on the concept of the constraint manifold, representing the set of safe robot configurations. Exploiting differential geometry techniques, i.e., the tangent space, we can construct a safe action space, allowing learning agents to sample arbitrary actions while ensuring safety. We demonstrate the method's effectiveness in a real-world robot air hockey task, showing that our method can handle high-dimensional tasks with complex constraints.","PeriodicalId":50388,"journal":{"name":"IEEE Transactions on Robotics","volume":"41 ","pages":"3442-3461"},"PeriodicalIF":9.4000,"publicationDate":"2025-03-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"Safe Reinforcement Learning on the Constraint Manifold: Theory and Applications\",\"authors\":\"Puze Liu;Haitham Bou-Ammar;Jan Peters;Davide Tateo\",\"doi\":\"10.1109/TRO.2025.3567477\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Integrating learning-based techniques, especially reinforcement learning, into robotics is promising for solving complex problems in unstructured environments. Most of the existing approaches rely on training in carefully calibrated simulators before being deployed on real robots, often without real-world fine-tuning. While effective in controlled settings, this framework falls short in applications where precise simulation is unavailable or the environment is too complex to model. Instead, on-robot learning, which learns by interacting directly with the real world, offers a promising alternative. One major problem for on-robot reinforcement learning is ensuring safety, as uncontrolled exploration can cause catastrophic damage to the robot or the environment. Indeed, safety specifications, often represented as constraints, can be complex and nonlinear, making safety challenging to guarantee in learning systems. In this article, we show how we can impose complex safety constraints on learning-based robotics systems in a principled manner, both from theoretical and practical points of view. Our approach is based on the concept of the constraint manifold, representing the set of safe robot configurations. Exploiting differential geometry techniques, i.e., the tangent space, we can construct a safe action space, allowing learning agents to sample arbitrary actions while ensuring safety. We demonstrate the method's effectiveness in a real-world robot air hockey task, showing that our method can handle high-dimensional tasks with complex constraints.\",\"PeriodicalId\":50388,\"journal\":{\"name\":\"IEEE Transactions on Robotics\",\"volume\":\"41 \",\"pages\":\"3442-3461\"},\"PeriodicalIF\":9.4000,\"publicationDate\":\"2025-03-06\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"IEEE Transactions on Robotics\",\"FirstCategoryId\":\"94\",\"ListUrlMain\":\"https://ieeexplore.ieee.org/document/10989544/\",\"RegionNum\":1,\"RegionCategory\":\"计算机科学\",\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"Q1\",\"JCRName\":\"ROBOTICS\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"IEEE Transactions on Robotics","FirstCategoryId":"94","ListUrlMain":"https://ieeexplore.ieee.org/document/10989544/","RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"ROBOTICS","Score":null,"Total":0}
Safe Reinforcement Learning on the Constraint Manifold: Theory and Applications
Integrating learning-based techniques, especially reinforcement learning, into robotics is promising for solving complex problems in unstructured environments. Most of the existing approaches rely on training in carefully calibrated simulators before being deployed on real robots, often without real-world fine-tuning. While effective in controlled settings, this framework falls short in applications where precise simulation is unavailable or the environment is too complex to model. Instead, on-robot learning, which learns by interacting directly with the real world, offers a promising alternative. One major problem for on-robot reinforcement learning is ensuring safety, as uncontrolled exploration can cause catastrophic damage to the robot or the environment. Indeed, safety specifications, often represented as constraints, can be complex and nonlinear, making safety challenging to guarantee in learning systems. In this article, we show how we can impose complex safety constraints on learning-based robotics systems in a principled manner, both from theoretical and practical points of view. Our approach is based on the concept of the constraint manifold, representing the set of safe robot configurations. Exploiting differential geometry techniques, i.e., the tangent space, we can construct a safe action space, allowing learning agents to sample arbitrary actions while ensuring safety. We demonstrate the method's effectiveness in a real-world robot air hockey task, showing that our method can handle high-dimensional tasks with complex constraints.
期刊介绍:
The IEEE Transactions on Robotics (T-RO) is dedicated to publishing fundamental papers covering all facets of robotics, drawing on interdisciplinary approaches from computer science, control systems, electrical engineering, mathematics, mechanical engineering, and beyond. From industrial applications to service and personal assistants, surgical operations to space, underwater, and remote exploration, robots and intelligent machines play pivotal roles across various domains, including entertainment, safety, search and rescue, military applications, agriculture, and intelligent vehicles.
Special emphasis is placed on intelligent machines and systems designed for unstructured environments, where a significant portion of the environment remains unknown and beyond direct sensing or control.