Towards Online Safety Corrections for Robotic Manipulation Policies
Ariana Spalter, Mark Roberts, Laura M. Hiatt
arXiv - CS - Robotics, published 2024-09-12. DOI: arxiv-2409.08233 (https://doi.org/arxiv-2409.08233). Citations: 0.
Abstract
Recent successes in applying reinforcement learning (RL) to robotics have shown that it is a viable approach for constructing robotic controllers. However, RL controllers can produce many collisions in environments where new obstacles appear during execution, which is a problem in safety-critical settings. We present a hybrid approach, called iKinQP-RL, that uses an Inverse Kinematics Quadratic Programming (iKinQP) controller to correct actions proposed by an RL policy at runtime, ensuring safe execution in the presence of obstacles that were not present during training. Preliminary experiments show that our iKinQP-RL framework completely eliminates collisions with new obstacles while maintaining a high task success rate.
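The general pattern behind such a runtime safety correction can be sketched in a few lines. The code below is a minimal, hypothetical illustration of the idea only, not the paper's iKinQP controller: an RL policy proposes a target position, and a safety filter checks it against obstacles unseen at training time, projecting any unsafe target back to the boundary of a keep-out radius. The real iKinQP controller instead solves a quadratic program over joint-space kinematics; here the closed-form projection is a simplified stand-in, and all function names and parameters are assumptions.

```python
import math

def rl_propose(goal):
    """Stand-in for an RL policy: naively propose the goal as the next
    end-effector target, ignoring obstacles (as a policy trained without
    them might)."""
    return goal

def safety_correct(target, obstacles, min_dist):
    """Correct a proposed 2D target so it stays at least `min_dist` from
    every obstacle. Simplified stand-in for a QP-based correction: each
    violating target is projected radially to the keep-out boundary.
    (With several nearby obstacles, sequential projection may not satisfy
    all constraints at once; a real QP handles them jointly.)"""
    x, y = target
    for ox, oy in obstacles:
        d = math.hypot(x - ox, y - oy)
        if d < min_dist:
            if d == 0.0:
                # Target exactly on the obstacle: pick an arbitrary
                # direction to push it out.
                x, y = ox + min_dist, oy
            else:
                scale = min_dist / d
                x, y = ox + (x - ox) * scale, oy + (y - oy) * scale
    return (x, y)

def step(goal, obstacles, min_dist=0.5):
    """One control cycle: propose, then correct before execution."""
    proposed = rl_propose(goal)
    return safety_correct(proposed, obstacles, min_dist)
```

Safe proposals pass through unchanged, so the filter is minimally invasive: it only alters actions that would violate the obstacle constraint, which is what lets the task success rate stay high while collisions are eliminated.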