Xu Zhou, Jiucai Zhang, Xiaoli Zhang
DOI: 10.1115/dscc2019-9086 · Mechatronic Systems and Control · Journal Article · Published 2019-11-26 · JCR Q4, Automation & Control Systems
Self-Reflective Learning Strategy for Persistent Autonomy of Aerial Manipulators
Autonomous aerial manipulators have great potential to assist humans with, or even fully automate, labor-intensive tasks such as aerial cleaning, aerial transportation, infrastructure repair, and agricultural inspection and sampling. Reinforcement learning holds promise for enabling the persistent autonomy of aerial manipulators because it can adapt to different situations by automatically learning optimal policies from interactions between the aerial manipulator and its environment. However, the learning process itself can experience failures that endanger the safety of the aerial manipulator and hence hinder persistent autonomy. To solve this problem, we propose a self-reflective learning strategy that allows the aerial manipulator to smartly and safely find optimal policies for new situations. This self-reflective process consists of three steps: identifying the appearance of a new situation, re-seeking the optimal policy with reinforcement learning, and evaluating whether to terminate self-reflection. Numerical simulations demonstrate that, compared with conventional learning-based autonomy, our strategy significantly reduces failures while still completing the given task.
Journal Introduction:
This international journal publishes both theoretical and application-oriented papers on various aspects of mechatronic systems, modelling, design, conventional and intelligent control, and intelligent systems. Application areas of mechatronics may include robotics, transportation, energy systems, manufacturing, sensors, actuators, and automation. Techniques of artificial intelligence may include soft computing (fuzzy logic, neural networks, genetic algorithms/evolutionary computing, probabilistic methods, etc.). Techniques may cover frequency and time domains, linear and nonlinear systems, and deterministic and stochastic processes. Hybrid techniques of mechatronics that combine conventional and intelligent methods are also included. First published in 1972, this journal originated with an emphasis on conventional control systems and computer-based applications. Subsequently, with rapid advances in the field and in view of the widespread interest and application of soft computing in control systems, this latter aspect was integrated into the journal. Now the area of mechatronics is included as the main focus. A unique feature of the journal is its pioneering role in bridging the gap between conventional systems and intelligent systems, with an equal emphasis on theory and practical applications, including system modelling, design and instrumentation. It appears four times per year.