Autonomous Robots Pub Date: 2024-06-04 DOI: 10.1007/s10514-024-10164-6
Abhishek Padalkar, Gabriel Quere, Antonin Raffin, João Silvério, Freek Stulp
{"title":"Guiding real-world reinforcement learning for in-contact manipulation tasks with Shared Control Templates","authors":"Abhishek Padalkar, Gabriel Quere, Antonin Raffin, João Silvério, Freek Stulp","doi":"10.1007/s10514-024-10164-6","DOIUrl":"10.1007/s10514-024-10164-6","url":null,"abstract":"<div><p>The requirement for a high number of training episodes has been a major limiting factor for the application of <i>Reinforcement Learning</i> (RL) in robotics. Learning skills directly on real robots requires time, causes wear and tear and can lead to damage to the robot and environment due to unsafe exploratory actions. The success of learning skills in simulation and transferring them to real robots has also been limited by the gap between reality and simulation. This is particularly problematic for tasks involving contact with the environment as contact dynamics are hard to model and simulate. In this paper we propose a framework which leverages a shared control framework for modeling known constraints defined by object interactions and task geometry to reduce the state and action spaces and hence the overall dimensionality of the reinforcement learning problem. The unknown task knowledge and actions are learned by a reinforcement learning agent by conducting exploration in the constrained environment. Using a pouring task and grid-clamp placement task (similar to peg-in-hole) as use cases and a 7-DoF arm, we show that our approach can be used to learn directly on the real robot. 
The pouring task is learned in only 65 episodes (16 min) and the grid-clamp placement task is learned in 75 episodes (17 min) with strong safety guarantees and simple reward functions, greatly alleviating the need for simulation.</p></div>","PeriodicalId":55409,"journal":{"name":"Autonomous Robots","volume":"48 4-5","pages":""},"PeriodicalIF":3.7,"publicationDate":"2024-06-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://link.springer.com/content/pdf/10.1007/s10514-024-10164-6.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141259479","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Autonomous Robots Pub Date: 2024-06-04 DOI: 10.1007/s10514-024-10167-3
Linda van der Spaa, Jens Kober, Michael Gienger
{"title":"Simultaneously learning intentions and preferences during physical human-robot cooperation","authors":"Linda van der Spaa, Jens Kober, Michael Gienger","doi":"10.1007/s10514-024-10167-3","DOIUrl":"10.1007/s10514-024-10167-3","url":null,"abstract":"<div><p>The advent of collaborative robots allows humans and robots to cooperate in a direct and physical way. While this leads to amazing new opportunities to create novel robotics applications, it is challenging to make the collaboration intuitive for the human. From a system’s perspective, understanding the human intentions seems to be one promising way to get there. However, human behavior exhibits large variations between individuals, such as for instance preferences or physical abilities. This paper presents a novel concept for simultaneously learning a model of the human intentions and preferences incrementally during collaboration with a robot. Starting out with a nominal model, the system acquires collaborative skills step-by-step within only very few trials. The concept is based on a combination of model-based reinforcement learning and inverse reinforcement learning, adapted to fit collaborations in which human and robot think and act independently. 
We test the method and compare it to two baselines: one that imitates the human and one that uses plain maximum entropy inverse reinforcement learning, both in simulation and in a user study with a Franka Emika Panda robot arm.</p></div>","PeriodicalId":55409,"journal":{"name":"Autonomous Robots","volume":"48 4-5","pages":""},"PeriodicalIF":3.7,"publicationDate":"2024-06-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://link.springer.com/content/pdf/10.1007/s10514-024-10167-3.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141259728","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Autonomous Robots Pub Date: 2024-05-24 DOI: 10.1007/s10514-024-10165-5
Ouerghi Meriam, Hou Mengxue, Zhang Fumin
{"title":"Laplacian regularized motion tomography for underwater vehicle flow mapping with sporadic localization measurements","authors":"Ouerghi Meriam, Hou Mengxue, Zhang Fumin","doi":"10.1007/s10514-024-10165-5","DOIUrl":"10.1007/s10514-024-10165-5","url":null,"abstract":"<div><p>Localization measurements for an autonomous underwater vehicle (AUV) are often difficult to obtain. In many cases, localization measurements are only available sporadically after the AUV comes to the sea surface. Since the motion of AUVs is often affected by unknown underwater flow fields, the sporadic localization measurements carry information about the underwater flow field. Motion tomography (MT) algorithms have been developed to compute an underwater flow map based on the sporadic localization measurements. This paper extends MT by introducing Laplacian regularization into the problem formulation and the MT algorithm. Laplacian regularization enforces smoothness in the spatial distribution of the underwater flow field. The resulting Laplacian regularized motion tomography (RMT) algorithm converges with a bounded finite error. The performance of RMT and other variants of MT is compared through the method of data resolution analysis. The improved performance of RMT is confirmed by experimental data collected from underwater glider ocean sensing experiments.</p></div>","PeriodicalId":55409,"journal":{"name":"Autonomous Robots","volume":"48 4-5","pages":""},"PeriodicalIF":3.7,"publicationDate":"2024-05-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141101854","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Autonomous Robots Pub Date: 2024-05-16 DOI: 10.1007/s10514-024-10159-3
Alessandra Rossi, Maike Paetzel-Prüsmann, Merel Keijsers, Michael Anderson, Susan Leigh Anderson, Daniel Barry, Jan Gutsche, Justin Hart, Luca Iocchi, Ainse Kokkelmans, Wouter Kuijpers, Yun Liu, Daniel Polani, Caleb Roscon, Marcus Scheunemann, Peter Stone, Florian Vahl, René van de Molengraft, Oskar von Stryk
{"title":"The human in the loop Perspectives and challenges for RoboCup 2050","authors":"Alessandra Rossi, Maike Paetzel-Prüsmann, Merel Keijsers, Michael Anderson, Susan Leigh Anderson, Daniel Barry, Jan Gutsche, Justin Hart, Luca Iocchi, Ainse Kokkelmans, Wouter Kuijpers, Yun Liu, Daniel Polani, Caleb Roscon, Marcus Scheunemann, Peter Stone, Florian Vahl, René van de Molengraft, Oskar von Stryk","doi":"10.1007/s10514-024-10159-3","DOIUrl":"10.1007/s10514-024-10159-3","url":null,"abstract":"<div><p>Robotics researchers have been focusing on developing autonomous and human-like intelligent robots that are able to plan, navigate, manipulate objects, and interact with humans in both static and dynamic environments. These capabilities, however, are usually developed for direct interactions with people in controlled environments, and evaluated primarily in terms of human safety. Consequently, human-robot interaction (HRI) in scenarios with no intervention of technical personnel is under-explored. However, in the future, robots will be deployed in unstructured and unsupervised environments where they will be expected to work unsupervised on tasks which require direct interaction with humans and may not necessarily be collaborative. Developing such robots requires comparing the effectiveness and efficiency of similar design approaches and techniques. Yet, issues regarding the reproducibility of results, comparing different approaches between research groups, and creating challenging milestones to measure performance and development over time make this difficult. Here we discuss the international robotics competition called RoboCup as a benchmark for the progress and open challenges in AI and robotics development. The long term goal of RoboCup is developing a robot soccer team that can win against the world’s best human soccer team by 2050. 
We selected RoboCup because it requires robots to be able to play with and against humans in unstructured environments, such as uneven fields and natural lighting conditions, and it challenges the known accepted dynamics in HRI. Considering the current state of robotics technology, RoboCup’s goal opens up several open research questions to be addressed by roboticists. In this paper, we (a) summarise the current challenges in robotics by using RoboCup development as an evaluation metric, (b) discuss the state-of-the-art approaches to these challenges and how they currently apply to RoboCup, and (c) present a path for future development in the given areas to meet RoboCup’s goal of having robots play soccer against and with humans by 2050.</p></div>","PeriodicalId":55409,"journal":{"name":"Autonomous Robots","volume":"48 2-3","pages":""},"PeriodicalIF":3.7,"publicationDate":"2024-05-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://link.springer.com/content/pdf/10.1007/s10514-024-10159-3.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141032933","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Autonomous Robots Pub Date: 2024-05-03 DOI: 10.1007/s10514-024-10161-9
{"title":"Editorial - Robotics: Science and Systems 2022","authors":"","doi":"10.1007/s10514-024-10161-9","DOIUrl":"10.1007/s10514-024-10161-9","url":null,"abstract":"","PeriodicalId":55409,"journal":{"name":"Autonomous Robots","volume":"48 2-3","pages":""},"PeriodicalIF":3.7,"publicationDate":"2024-05-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142408664","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Autonomous Robots Pub Date: 2024-04-20 DOI: 10.1007/s10514-024-10157-5
Marco Faroni, Nicola Pedrocchi, Manuel Beschi
{"title":"Adaptive hybrid local–global sampling for fast informed sampling-based optimal path planning","authors":"Marco Faroni, Nicola Pedrocchi, Manuel Beschi","doi":"10.1007/s10514-024-10157-5","DOIUrl":"10.1007/s10514-024-10157-5","url":null,"abstract":"<div><p>This paper improves the performance of RRT<sup>*</sup>-like sampling-based path planners by combining admissible informed sampling and local sampling (i.e., sampling the neighborhood of the current solution). An adaptive strategy regulates the trade-off between exploration (admissible informed sampling) and exploitation (local sampling) based on online rewards from previous samples. The paper demonstrates that the algorithm is asymptotically optimal and has a better convergence rate than state-of-the-art path planners (e.g., Informed-RRT<sup>*</sup>) in several simulated and real-world scenarios. An open-source, ROS-compatible implementation of the algorithm is publicly available.</p></div>","PeriodicalId":55409,"journal":{"name":"Autonomous Robots","volume":"48 2-3","pages":""},"PeriodicalIF":3.7,"publicationDate":"2024-04-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://link.springer.com/content/pdf/10.1007/s10514-024-10157-5.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140629716","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Autonomous Robots Pub Date: 2024-04-17 DOI: 10.1007/s10514-024-10160-w
Xiaoying Wang, Tong Zhang
{"title":"Reinforcement learning with imitative behaviors for humanoid robots navigation: synchronous planning and control","authors":"Xiaoying Wang, Tong Zhang","doi":"10.1007/s10514-024-10160-w","DOIUrl":"10.1007/s10514-024-10160-w","url":null,"abstract":"<div><p>Humanoid robots have strong adaptability to complex environments and possess human-like flexibility, enabling them to perform precise farming and harvesting tasks in varying depths of terrains. They serve as essential tools for agricultural intelligence. In this article, a novel method is proposed to improve the robustness of autonomous navigation for humanoid robots, which intercommunicates the data fusion of the footprint planning and control levels. In particular, a fine-tuned deep reinforcement learning model, Proximal Policy Optimization (PPO), is introduced into this layer, before which a heuristic trajectory is generated based on imitation learning. In the RL period, the KL divergence between the agent’s policy and the imitative expert policy is added to the advantage function as a value penalty. As a proof of concept, our navigation policy is trained in a robotic simulator and then successfully applied to the physical robot <i>GTX</i> for indoor multi-mode navigation. The experimental results show that incorporating imitation learning imparts anthropomorphic attributes to robots and facilitates the generation of seamless footstep patterns. The ZMP trajectory in the y-direction, measured from the center, improves significantly, by 21.56%. Additionally, this method improves dynamic locomotion stability, with the body attitude angle staying within ±5.5°, compared to ±48.4° with the traditional algorithm. In general, the navigation error is below 5 cm, which we verified in the experiments.
It is thought that the outcome of the proposed framework presented in this article can provide a reference for researchers studying autonomous navigation applications of humanoid robots on uneven ground.</p></div>","PeriodicalId":55409,"journal":{"name":"Autonomous Robots","volume":"48 2-3","pages":""},"PeriodicalIF":3.7,"publicationDate":"2024-04-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140608698","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Autonomous Robots Pub Date: 2024-03-30 DOI: 10.1007/s10514-024-10158-4
Giuseppe Vecchio, Simone Palazzo, Dario C. Guastella, Daniela Giordano, Giovanni Muscato, Concetto Spampinato
{"title":"Terrain traversability prediction through self-supervised learning and unsupervised domain adaptation on synthetic data","authors":"Giuseppe Vecchio, Simone Palazzo, Dario C. Guastella, Daniela Giordano, Giovanni Muscato, Concetto Spampinato","doi":"10.1007/s10514-024-10158-4","DOIUrl":"10.1007/s10514-024-10158-4","url":null,"abstract":"<div><p>Terrain traversability estimation is a fundamental task for supporting robot navigation on uneven surfaces. Recent learning-based approaches for predicting traversability from RGB images have shown promising results, but require manual annotation of a large number of images for training. To address this limitation, we present a method for traversability estimation on unlabeled videos that combines dataset synthesis, self-supervision and unsupervised domain adaptation. We pose the traversability estimation as a vector regression task over vertical bands of the observed frame. The model is pre-trained through self-supervision to reduce the distribution shift between synthetic and real data and encourage shared feature learning. Then, supervised training on synthetic videos is carried out, while employing an unsupervised domain adaptation loss to improve its generalization capabilities on real scenes. Experimental results show that our approach is on par with standard supervised training, and effectively supports robot navigation without the need of manual annotations. 
Training code and synthetic dataset will be publicly released at: https://github.com/perceivelab/traversability-synth.</p></div>","PeriodicalId":55409,"journal":{"name":"Autonomous Robots","volume":"48 2-3","pages":""},"PeriodicalIF":3.7,"publicationDate":"2024-03-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://link.springer.com/content/pdf/10.1007/s10514-024-10158-4.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140364755","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Autonomous Robots Pub Date: 2024-01-30 DOI: 10.1007/s10514-024-10156-6
Pao-Te Lin, Kuo-Shih Tseng
{"title":"Maximal coverage problems with routing constraints using cross-entropy Monte Carlo tree search","authors":"Pao-Te Lin, Kuo-Shih Tseng","doi":"10.1007/s10514-024-10156-6","DOIUrl":"10.1007/s10514-024-10156-6","url":null,"abstract":"<div><p>Spatial search and environmental monitoring are key technologies in robotics. These problems can be reformulated as maximal coverage problems with routing constraints, which are NP-hard. The generalized cost-benefit algorithm (GCB) can solve these problems with theoretical guarantees. To achieve better performance, evolutionary algorithms (EA) boost its performance via more samples. However, it is hard to know the terminal conditions under which EA outperforms GCB. To solve these problems with theoretical guarantees and terminal conditions, this research proposes the cross-entropy based Monte Carlo Tree Search algorithm (CE-MCTS). It consists of three parts: the EA for sampling the branches, the upper confidence bound policy for selections, and the estimation of distribution algorithm for simulations. The experiments demonstrate that the CE-MCTS outperforms benchmark approaches (e.g., GCB, EAMC) in spatial search problems.</p></div>","PeriodicalId":55409,"journal":{"name":"Autonomous Robots","volume":"48 1","pages":""},"PeriodicalIF":3.7,"publicationDate":"2024-01-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"139646697","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}