{"title":"AutoLfD:关闭从演示中学习的循环","authors":"Shaokang Wu;Yijin Wang;Yanlong Huang","doi":"10.1109/TASE.2025.3532820","DOIUrl":null,"url":null,"abstract":"Over the past few years, there have been numerous works towards advancing the generalization capability of robots, among which learning from demonstrations (LfD) has drawn much attention by virtue of its user-friendly and data-efficient nature. While many LfD solutions have been reported, a key question has not been properly addressed: how can we evaluate the generalization performance of LfD? For instance, when a robot draws a letter that needs to pass through new desired points, how does it ensure the new trajectory maintains a similar shape to the demonstration? This question becomes more relevant when a new task is significantly far from the demonstrated region. To tackle this issue, a user often resorts to manual tuning of the hyperparameters of an LfD approach until a satisfactory trajectory is attained. In this paper, we aim to provide closed-loop evaluative feedback for LfD and optimize LfD in an automatic fashion. Specifically, we consider dynamical movement primitives (DMP) and kernelized movement primitives (KMP) as examples and develop a generic optimization framework capable of measuring the generalization performance of DMP and KMP and auto-optimizing their hyperparameters. Evaluations including peg-in-hole, block-stacking and pushing tasks on a real robot evidence the applicability of our framework. Note to Practitioners—The paper is motivated by the demand to transfer human skills to robots. While the problems of ‘what to learn’ and ‘how to learn’ have been long-standing research topics, the solutions for evaluating the quality of such skill transfer remain largely open. We introduce a novel closed-loop framework towards transferring human skills to robots in an automatic manner. Specifically, we collect a training dataset that reflects user preference for trajectory adaptation and train a trajectory encoder network using the dataset. With the encoder network, we design a robust metric to measure the skill transfer quality and subsequently employ the metric to guide imitation learning of human skills. By using our framework, unseen robotic tasks can be tackled by adapting the demonstrations straightforwardly, where relevant hyperparameters involved in skill transfer are optimized automatically.","PeriodicalId":51060,"journal":{"name":"IEEE Transactions on Automation Science and Engineering","volume":"22 ","pages":"11124-11138"},"PeriodicalIF":6.4000,"publicationDate":"2025-01-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"AutoLfD: Closing the Loop for Learning From Demonstrations\",\"authors\":\"Shaokang Wu;Yijin Wang;Yanlong Huang\",\"doi\":\"10.1109/TASE.2025.3532820\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Over the past few years, there have been numerous works towards advancing the generalization capability of robots, among which learning from demonstrations (LfD) has drawn much attention by virtue of its user-friendly and data-efficient nature. While many LfD solutions have been reported, a key question has not been properly addressed: how can we evaluate the generalization performance of LfD? For instance, when a robot draws a letter that needs to pass through new desired points, how does it ensure the new trajectory maintains a similar shape to the demonstration? 
This question becomes more relevant when a new task is significantly far from the demonstrated region. To tackle this issue, a user often resorts to manual tuning of the hyperparameters of an LfD approach until a satisfactory trajectory is attained. In this paper, we aim to provide closed-loop evaluative feedback for LfD and optimize LfD in an automatic fashion. Specifically, we consider dynamical movement primitives (DMP) and kernelized movement primitives (KMP) as examples and develop a generic optimization framework capable of measuring the generalization performance of DMP and KMP and auto-optimizing their hyperparameters. Evaluations including peg-in-hole, block-stacking and pushing tasks on a real robot evidence the applicability of our framework. Note to Practitioners—The paper is motivated by the demand to transfer human skills to robots. While the problems of ‘what to learn’ and ‘how to learn’ have been long-standing research topics, the solutions for evaluating the quality of such skill transfer remain largely open. We introduce a novel closed-loop framework towards transferring human skills to robots in an automatic manner. Specifically, we collect a training dataset that reflects user preference for trajectory adaptation and train a trajectory encoder network using the dataset. With the encoder network, we design a robust metric to measure the skill transfer quality and subsequently employ the metric to guide imitation learning of human skills. By using our framework, unseen robotic tasks can be tackled by adapting the demonstrations straightforwardly, where relevant hyperparameters involved in skill transfer are optimized automatically.\",\"PeriodicalId\":51060,\"journal\":{\"name\":\"IEEE Transactions on Automation Science and Engineering\",\"volume\":\"22 \",\"pages\":\"11124-11138\"},\"PeriodicalIF\":6.4000,\"publicationDate\":\"2025-01-22\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"IEEE Transactions on Automation Science and Engineering\",\"FirstCategoryId\":\"94\",\"ListUrlMain\":\"https://ieeexplore.ieee.org/document/10849584/\",\"RegionNum\":2,\"RegionCategory\":\"计算机科学\",\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"Q1\",\"JCRName\":\"AUTOMATION & CONTROL SYSTEMS\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"IEEE Transactions on Automation Science and Engineering","FirstCategoryId":"94","ListUrlMain":"https://ieeexplore.ieee.org/document/10849584/","RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"AUTOMATION & CONTROL SYSTEMS","Score":null,"Total":0}
AutoLfD: Closing the Loop for Learning From Demonstrations
Over the past few years, numerous works have sought to advance the generalization capability of robots; among them, learning from demonstrations (LfD) has drawn much attention by virtue of its user-friendly and data-efficient nature. While many LfD solutions have been reported, a key question has not been properly addressed: how can we evaluate the generalization performance of LfD? For instance, when a robot draws a letter that must pass through new desired points, how can we ensure that the new trajectory maintains a shape similar to the demonstration? This question becomes more pressing when a new task lies far outside the demonstrated region. To tackle it, a user often resorts to manually tuning the hyperparameters of an LfD approach until a satisfactory trajectory is attained. In this paper, we aim to provide closed-loop evaluative feedback for LfD and to optimize LfD automatically. Specifically, taking dynamical movement primitives (DMP) and kernelized movement primitives (KMP) as examples, we develop a generic optimization framework capable of measuring the generalization performance of DMP and KMP and auto-optimizing their hyperparameters. Evaluations on a real robot, including peg-in-hole, block-stacking, and pushing tasks, demonstrate the applicability of our framework.

Note to Practitioners—The paper is motivated by the demand to transfer human skills to robots. While 'what to learn' and 'how to learn' have been long-standing research topics, solutions for evaluating the quality of such skill transfer remain largely open. We introduce a novel closed-loop framework for transferring human skills to robots automatically. Specifically, we collect a training dataset that reflects user preferences for trajectory adaptation and use it to train a trajectory encoder network. With the encoder network, we design a robust metric to measure skill-transfer quality and subsequently employ the metric to guide imitation learning of human skills. With our framework, unseen robotic tasks can be tackled by directly adapting the demonstrations, with the relevant hyperparameters involved in skill transfer optimized automatically.
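To make the described loop concrete, below is a minimal, self-contained sketch of the kind of closed loop the abstract outlines, under stated assumptions: it fits a one-dimensional dynamic movement primitive (DMP) to a demonstration, rolls it out toward an unseen goal, and searches DMP hyperparameters to maximize a shape-similarity score. The correlation-based shape_score is only a stand-in for the paper's learned trajectory-encoder metric, random search stands in for its optimizer, and every name here (fit_dmp, rollout, shape_score) is illustrative rather than the authors' implementation.

    # Minimal sketch: adapt a demonstrated 1-D trajectory to a new goal
    # with a DMP and auto-tune hyperparameters against a shape metric.
    import numpy as np

    def fit_dmp(demo, dt, alpha=25.0, n_basis=20, alpha_x=4.0):
        """Learn DMP forcing-term weights from a 1-D demonstration."""
        beta = alpha / 4.0                        # critical damping
        T = len(demo)
        t = np.arange(T) * dt
        tau = t[-1]
        y = np.asarray(demo, dtype=float)
        g, y0 = y[-1], y[0]
        dy = np.gradient(y, dt)
        ddy = np.gradient(dy, dt)
        x = np.exp(-alpha_x * t / tau)            # canonical phase variable
        c = np.exp(-alpha_x * np.linspace(0, 1, n_basis))  # basis centers
        h = n_basis ** 1.5 / c / alpha_x                   # basis widths
        psi = np.exp(-h * (x[:, None] - c) ** 2)
        # Forcing term that reproduces the demonstration.
        f_target = tau**2 * ddy - alpha * (beta * (g - y) - tau * dy)
        xi = x * (g - y0)
        # Locally weighted regression, one weight per basis function.
        w = (psi * (xi * f_target)[:, None]).sum(0) / \
            ((psi * (xi**2)[:, None]).sum(0) + 1e-10)
        return dict(w=w, c=c, h=h, alpha=alpha, beta=beta,
                    alpha_x=alpha_x, y0=y0, tau=tau, dt=dt, T=T)

    def rollout(dmp, goal):
        """Integrate the DMP toward a (possibly new) goal, Euler steps."""
        y, z, x = dmp["y0"], 0.0, 1.0
        dt, tau = dmp["dt"], dmp["tau"]
        traj = np.empty(dmp["T"])
        for i in range(dmp["T"]):
            psi = np.exp(-dmp["h"] * (x - dmp["c"]) ** 2)
            f = psi @ dmp["w"] / (psi.sum() + 1e-10) * x * (goal - dmp["y0"])
            z += dt / tau * (dmp["alpha"] * (dmp["beta"] * (goal - y) - z) + f)
            y += dt / tau * z
            x += dt / tau * (-dmp["alpha_x"] * x)
            traj[i] = y
        return traj

    def shape_score(a, b):
        """Correlation of min-max-normalized signals: shape, not scale."""
        norm = lambda s: (s - s.min()) / (np.ptp(s) + 1e-10)
        return float(np.corrcoef(norm(a), norm(b))[0, 1])

    # Toy demonstration: a bulged reach from 0 to 1, then an unseen goal.
    t = np.linspace(0.0, 1.0, 200)
    demo = t + 0.3 * np.sin(np.pi * t)
    new_goal = 1.8

    # Closed loop: sample hyperparameters, adapt, score, keep the best.
    rng = np.random.default_rng(0)
    best_score, best_params = -np.inf, None
    for _ in range(50):
        alpha = rng.uniform(10.0, 60.0)
        n_basis = int(rng.integers(10, 40))
        traj = rollout(fit_dmp(demo, dt=t[1] - t[0], alpha=alpha,
                               n_basis=n_basis), new_goal)
        score = shape_score(demo, traj)
        if score > best_score:
            best_score, best_params = score, (alpha, n_basis)
    print(f"best shape score {best_score:.3f} with "
          f"alpha={best_params[0]:.1f}, n_basis={best_params[1]}")

In the paper's framework, the hand-crafted stand-in metric would be replaced by the trained trajectory encoder network, so the loop optimizes hyperparameters against learned user preferences rather than a fixed correlation score.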
Journal description:
The IEEE Transactions on Automation Science and Engineering (T-ASE) publishes fundamental papers on Automation, emphasizing scientific results that advance efficiency, quality, productivity, and reliability. T-ASE encourages interdisciplinary approaches from computer science, control systems, electrical engineering, mathematics, mechanical engineering, operations research, and other fields. T-ASE welcomes results relevant to industries such as agriculture, biotechnology, healthcare, home automation, maintenance, manufacturing, pharmaceuticals, retail, security, service, supply chains, and transportation. T-ASE addresses a research community willing to integrate knowledge across disciplines and industries. For this purpose, each paper includes a Note to Practitioners that summarizes how its results can be applied or how they might be extended to apply in practice.