{"title":"Trade-Off Between Robustness and Rewards Adversarial Training for Deep Reinforcement Learning Under Large Perturbations","authors":"Jeffrey Huang;Ho Jin Choi;Nadia Figueroa","doi":"10.1109/LRA.2023.3324590","DOIUrl":"https://doi.org/10.1109/LRA.2023.3324590","url":null,"abstract":"Deep Reinforcement Learning (DRL) has become a popular approach for training robots due to its generalization promise, complex task capacity and minimal human intervention. Nevertheless, DRL-trained controllers are vulnerable to even the smallest of perturbations on its inputs which can lead to catastrophic failures in real-world human-centric environments with large and unexpected perturbations. In this work, we study the vulnerability of state-of-the-art DRL subject to large perturbations and propose a novel adversarial training framework for robust control. Our approach generates aggressive attacks on the state space and the expected state-action values to emulate real-world perturbations such as sensor noise, perception failures, physical perturbations, observations mismatch, etc. To achieve this, we reformulate the adversarial risk to yield a trade-off between rewards and robustness (TBRR). We show that TBRR-aided DRL training is robust to aggressive attacks and outperforms baselines on standard DRL benchmarks (Cartpole, Pendulum), Meta-World tasks (door manipulation) and a vision-based grasping task with a 7DoF manipulator. Finally, we show that the vision-based grasping task trained in simulation via TBRR transfers sim2real with 70% success rate subject to sensor impairment and physical perturbations without any retraining.","PeriodicalId":13241,"journal":{"name":"IEEE Robotics and Automation Letters","volume":"8 12","pages":"8018-8025"},"PeriodicalIF":5.2,"publicationDate":"2023-10-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"50297618","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Wheel Vision: Wheel-Terrain Interaction Measurement and Analysis Using a Sensorized Transparent Wheel on Deformable Terrains","authors":"Chen Yao;Feng Xue;Zhengyin Wang;Ye Yuan;Zheng Zhu;Liang Ding;Zhenzhong Jia","doi":"10.1109/LRA.2023.3324291","DOIUrl":"https://doi.org/10.1109/LRA.2023.3324291","url":null,"abstract":"The off-road locomotion of wheeled mobile robots (WMRs) over soft terrains can be quite challenging due to the complicated wheel-terrain interaction (WTI). To avoid unforeseen non-geometric hazards such as excessive sinkage or slippage, it is crucial to oversee these terrain-related uncertainties. However, determining the appropriate sensing principle for WTI and hazard prediction remains an open problem. This letter showcases an onboard sensorized transparent wheel concept (STW) aiming to explicitly characterize the WTI over deformable terrains for rovers. The STW configuration can provide directly in-wheel interaction views, thereby offering in-wheel measurement (IM) of WTI parameters and observations of soil flow simultaneously. Unlike traditional vision-based methods, this in-situ wheel vision can characterize the entire contact geometry distributions, eliminating complicated yet inaccurate model-based stochastic estimations. Consequently, it can achieve robust and real-time (30 Hz) performance even under complex motions. We conduct representative terrain experiments on a single-wheel testbed to verify the performance of our proposed STW system, and showcase its applicability as a terramechanics test tool to remodel WTI mechanics, as seen in \u0000<uri>https://youtu.be/aYKW1Pp4ENw</uri>\u0000.","PeriodicalId":13241,"journal":{"name":"IEEE Robotics and Automation Letters","volume":"8 12","pages":"7938-7945"},"PeriodicalIF":5.2,"publicationDate":"2023-10-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"50297939","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Learning-Based Propulsion Control for Amphibious Quadruped Robots With Dynamic Adaptation to Changing Environment","authors":"Qingfeng Yao;Linghan Meng;Qifeng Zhang;Jing Zhao;Joni Pajarinen;Xiaohui Wang;Zhibin Li;Cong Wang","doi":"10.1109/LRA.2023.3323893","DOIUrl":"https://doi.org/10.1109/LRA.2023.3323893","url":null,"abstract":"This letter proposes a learning-based adaptive propulsion control (APC) method for a quadruped robot integrated with thrusters in amphibious environments, allowing it to move efficiently in water while maintaining its ground locomotion capabilities. We designed the specific reinforcement learning method to train the neural network to perform the vector propulsion control. Our approach coordinates the legs and propeller, enabling the robot to achieve speed and trajectory tracking tasks in the presence of actuator failures and unknown disturbances. Our simulated validations of the robot in water demonstrate the effectiveness of the trained neural network to predict the disturbances and actuator failures based on historical information, showing that the framework is adaptable to changing environments and is suitable for use in dynamically changing situations. Our proposed approach is suited to the hardware augmentation of quadruped robots to create avenues in the field of amphibious robotics and expand the use of quadruped robots in various applications.","PeriodicalId":13241,"journal":{"name":"IEEE Robotics and Automation Letters","volume":"8 12","pages":"7889-7896"},"PeriodicalIF":5.2,"publicationDate":"2023-10-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"50297933","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"DORF: A Dynamic Object Removal Framework for Robust Static LiDAR Mapping in Urban Environments","authors":"Zhiming Chen;Kun Zhang;Hua Chen;Michael Yu Wang;Wei Zhang;Hongyu Yu","doi":"10.1109/LRA.2023.3323196","DOIUrl":"https://doi.org/10.1109/LRA.2023.3323196","url":null,"abstract":"3D point cloud maps are widely used in robotic tasks like localization and planning. However, dynamic objects, such as cars and pedestrians, can introduce ghost artifacts during the map generation process, leading to reduced map quality and hindering normal robot navigation. Online dynamic object removal methods are restricted to utilize only local scope information and have limited performance. To address this challenge, we propose DORF (Dynamic Object Removal Framework), a novel coarse-to-fine offline framework that exploits global 4D spatial-temporal LiDAR information to achieve clean static point cloud map generation, which reaches the state-of-the-art performance among existing offline methods. DORF first conservatively preserves the definite static points leveraging the Receding Horizon Sampling (RHS) mechanism proposed by us. Then DORF gradually recovers more ambiguous static points, guided by the inherent characteristic of dynamic objects in urban environments which necessitates their interaction with the ground. We validate the effectiveness and robustness of DORF across various types of highly dynamic datasets.","PeriodicalId":13241,"journal":{"name":"IEEE Robotics and Automation Letters","volume":"8 12","pages":"7922-7929"},"PeriodicalIF":5.2,"publicationDate":"2023-10-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"50297938","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Vision-Based Uncertainty-Aware Motion Planning Based on Probabilistic Semantic Segmentation","authors":"Ralf Römer;Armin Lederer;Samuel Tesfazgi;Sandra Hirche","doi":"10.1109/LRA.2023.3322899","DOIUrl":"https://doi.org/10.1109/LRA.2023.3322899","url":null,"abstract":"For safe operation, a robot must be able to avoid collisions in uncertain environments. Existing approaches for motion planning under uncertainties often assume parametric obstacle representations and Gaussian uncertainty, which can be inaccurate. While visual perception can deliver a more accurate representation of the environment, its use for safe motion planning is limited by the inherent miscalibration of neural networks and the challenge of obtaining adequate datasets. To address these limitations, we propose to employ ensembles of deep semantic segmentation networks trained with massively augmented datasets to ensure reliable probabilistic occupancy information. To avoid conservatism during motion planning, we directly employ the probabilistic perception in a scenario-based path planning approach. A velocity scheduling scheme is applied to the path to ensure a safe motion despite tracking inaccuracies. We demonstrate the effectiveness of the massive data augmentation in combination with deep ensembles and the proposed scenario-based planning approach in comparisons to state-of-the-art methods and validate our framework in an experiment with a human hand as an obstacle.","PeriodicalId":13241,"journal":{"name":"IEEE Robotics and Automation Letters","volume":"8 11","pages":"7825-7832"},"PeriodicalIF":5.2,"publicationDate":"2023-10-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"50248258","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Keyframe Selection Via Deep Reinforcement Learning for Skeleton-Based Gesture Recognition","authors":"Minggang Gan;Jinting Liu;Yuxuan He;Aobo Chen;Qianzhao Ma","doi":"10.1109/LRA.2023.3322645","DOIUrl":"https://doi.org/10.1109/LRA.2023.3322645","url":null,"abstract":"Skeleton-based gesture recognition has attracted extensive attention and has made great progress. However, mainstream methods generally treat all frames as equally important, which may limit performance, especially when dealing with high inter-class variance in gesture. To tackle this issue, we propose an approach that models a Markov decision process to identify keyframes while discarding irrelevant ones. This article proposes a deep reinforcement learning double-feature double-motion network comprising two main components: a baseline gesture recognition model and a frame selection network. These two components mutually influence each other, resulting in enhanced overall performance. Following the evaluation of the SHREC-17 and F-PHAB datasets, our proposed method demonstrates superior performance.","PeriodicalId":13241,"journal":{"name":"IEEE Robotics and Automation Letters","volume":"8 11","pages":"7807-7814"},"PeriodicalIF":5.2,"publicationDate":"2023-10-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"50248394","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Adaptation of Flipper-Mud Interactions Enables Effective Terrestrial Locomotion on Muddy Substrates","authors":"Shipeng Liu;Boyuan Huang;Feifei Qian","doi":"10.1109/LRA.2023.3323123","DOIUrl":"https://doi.org/10.1109/LRA.2023.3323123","url":null,"abstract":"Moving on natural muddy terrains, where soil composition and water content vary significantly, is complex and challenging. To understand how mud properties and robot-mud interaction strategies affect locomotion performance on mud, we study the terrestrial locomotion of a mudskipper-inspired robot on synthetic mud with precisely-controlled ratios of sand, clay, and water. We observed a non-monotonic dependence of the robot speed on mud water content. Robot speed was the largest on mud with intermediate levels of water content (25%–26%), but decreased significantly on higher or lower water content. Measurements of mud reaction force revealed two distinct failure mechanisms. At high water content, the reduced mud shear strength led to a large slippage of robot appendages and a significantly reduced step length. At low water content, the increased mud suction force caused appendage entrapment, resulting in a large negative displacement in the robot body during the swing phase. A simple model successfully captured the observed robot performance, and informed adaptation strategies that increased robot speed by more than 200%. Our study is a beginning step to extend robot mobility beyond simple substrates towards a wider range of complex, heterogeneous terrains.","PeriodicalId":13241,"journal":{"name":"IEEE Robotics and Automation Letters","volume":"8 12","pages":"7978-7985"},"PeriodicalIF":5.2,"publicationDate":"2023-10-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"50297613","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Pixel-Level Collision-Free Grasp Prediction Network for Medical Test Tube Sorting on Cluttered Trays","authors":"Shihao Ge;Beiping Hou;Wen Zhu;Yuzhen Zhu;Senjian Lu;Yangbin Zheng","doi":"10.1109/LRA.2023.3322896","DOIUrl":"https://doi.org/10.1109/LRA.2023.3322896","url":null,"abstract":"Robotic sorting shows a promising aspect for future developments in medical field. However, vision-based grasp detection of medical devices is usually in unstructured or cluttered environments, which raises major challenges for the development of robotic sorting systems. In this letter, a pixel-level grasp detection method is proposed to predict the optimal collision-free grasp configuration on RGB images. First, an Adaptive Grasp Flex Classify (AGFC) model is introduced to add category attributes to distinguish test tube arrangements in complex scenarios. Then, we propose an end-to-end trainable CNN-based architecture, which delivers high quality results for grasp detection and avoids the confusion in neural network learning, to generate the AGFC-model. Utilizing this, we design a Residual Efficient Atrous Spatial Pyramid (REASP) block to further increase the accuracy of grasp detection. Finally, a collision-free manipulation policy is designed to guide the robot to grasp. Experiments on various scenarios are implemented to illustrate the robustness and the effectiveness of our approach, and a robotic grasping platform is constructed to evaluate its application performance. Overall, the developed robotic sorting system achieves a success rate of 95% on test tube sorting.","PeriodicalId":13241,"journal":{"name":"IEEE Robotics and Automation Letters","volume":"8 12","pages":"7897-7904"},"PeriodicalIF":5.2,"publicationDate":"2023-10-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"50297934","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Correction to “Improving Multi-Agent Trajectory Prediction Using Traffic States on Interactive Driving Scenarios”","authors":"Chalavadi Vishnu;Gottala Vineel Abhinav;Debaditya Roy;C. Krishna Mohan;Ch. Sobhan Babu","doi":"10.1109/LRA.2023.3320320","DOIUrl":"https://doi.org/10.1109/LRA.2023.3320320","url":null,"abstract":"Presents corrections to the article “Improving Multi-Agent Trajectory Prediction Using Traffic States on Interactive Driving Scenarios”.","PeriodicalId":13241,"journal":{"name":"IEEE Robotics and Automation Letters","volume":"8 11","pages":"7519-7519"},"PeriodicalIF":5.2,"publicationDate":"2023-10-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://ieeexplore.ieee.org/iel7/7083369/10254630/10273669.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"50247666","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Conformal Predictive Safety Filter for RL Controllers in Dynamic Environments","authors":"Kegan J. Strawn;Nora Ayanian;Lars Lindemann","doi":"10.1109/LRA.2023.3322644","DOIUrl":"https://doi.org/10.1109/LRA.2023.3322644","url":null,"abstract":"The interest in using reinforcement learning (RL) controllers in safety-critical applications such as robot navigation around pedestrians motivates the development of additional safety mechanisms. Running RL-enabled systems among uncertain dynamic agents may result in high counts of collisions and failures to reach the goal. The system could be safer if the pre-trained RL policy was uncertainty-informed. For that reason, we propose \u0000<italic>conformal predictive safety filters</i>\u0000 that: 1) predict the other agents' trajectories, 2) use statistical techniques to provide uncertainty intervals around these predictions, and 3) learn an additional safety filter that closely follows the RL controller but avoids the uncertainty intervals. We use conformal prediction to learn uncertainty-informed predictive safety filters, which make no assumptions about the agents' distribution. The framework is modular and outperforms the existing controllers in simulation. We demonstrate our approach with multiple experiments in a collision avoidance gym environment and show that our approach minimizes the number of collisions without making overly conservative predictions.","PeriodicalId":13241,"journal":{"name":"IEEE Robotics and Automation Letters","volume":"8 11","pages":"7833-7840"},"PeriodicalIF":5.2,"publicationDate":"2023-10-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"50248392","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}