Jared Town;Zachary Morrison;Rushikesh Kamalapurkar
IEEE Transactions on Control Systems Technology, vol. 32, no. 6, pp. 2444-2451, published 2024-06-18. DOI: 10.1109/TCST.2024.3410128. URL: https://ieeexplore.ieee.org/document/10561612/
Pilot Performance Modeling via Observer-Based Inverse Reinforcement Learning
The focus of this brief is behavior modeling for pilots of unmanned aerial systems. The pilot is assumed to make decisions that optimize an unknown cost functional. The cost functional is estimated from observed trajectories using a novel inverse reinforcement learning (IRL) framework. The resulting IRL problem often admits multiple solutions. In this brief, a recently developed IRL observer is adapted to the pilot behavior modeling problem. The observer is shown to converge to one of the equivalent solutions of the corresponding IRL problem. The developed technique is implemented on a quadcopter where the pilot is a surrogate linear-quadratic controller that generates velocity commands for set-point regulation of the quadcopter. Experimental results demonstrate the ability of the developed method to learn equivalent cost functionals.
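The remark that the IRL problem "admits multiple solutions" can be illustrated with a toy linear-quadratic case. The sketch below is not the authors' quadcopter model or their observer. It assumes a hypothetical single-axis system dx/dt = a*x + b*u, where x is the position error and u is the velocity command, with cost integral of q*x^2 + r*u^2. The names `lqr_gain` and `simulate` and all numeric values are chosen only for illustration. Under these assumptions, scaling q and r by the same positive factor leaves the optimal feedback gain unchanged, so the two cost functionals produce identical trajectories and cannot be told apart from observed data alone.

```python
import math

def lqr_gain(a, b, q, r):
    """Optimal gain k (u = -k*x) for dx/dt = a*x + b*u with cost q*x^2 + r*u^2."""
    # Positive root of the scalar Riccati equation 2*a*p - (b*p)**2 / r + q = 0
    p = r * (a + math.sqrt(a * a + b * b * q / r)) / (b * b)
    return b * p / r

def simulate(k, a, b, x0=1.0, dt=0.01, steps=500):
    """Forward-Euler rollout of the closed loop dx/dt = (a - b*k)*x."""
    xs = [x0]
    for _ in range(steps):
        x = xs[-1]
        u = -k * x  # surrogate "pilot": linear state feedback
        xs.append(x + dt * (a * x + b * u))
    return xs

# Assumed toy model: pure integrator (position error driven by velocity command)
a, b = 0.0, 1.0

k1 = lqr_gain(a, b, q=4.0, r=1.0)  # one cost functional
k2 = lqr_gain(a, b, q=8.0, r=2.0)  # same cost scaled by 2
k3 = lqr_gain(a, b, q=1.0, r=1.0)  # a genuinely different cost

print("k1 =", k1, "k2 =", k2, "k3 =", k3)
print("same trajectories for k1 and k2:",
      all(math.isclose(x, y) for x, y in zip(simulate(k1, a, b), simulate(k2, a, b))))
```

In this toy setting, any estimator that matches the observed behavior can at best recover the ratio q/r, which is one way to read "equivalent cost functionals" in the abstract.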
About the journal:
The IEEE Transactions on Control Systems Technology publishes high-quality technical papers on technological advances in control engineering. The word technology comes from the Greek technologia; its modern meaning is a scientific method for achieving a practical purpose. Control systems technology includes all aspects of control engineering needed to implement practical control systems, from analysis and design through simulation and hardware. A primary purpose of the journal is to serve as an archival publication that bridges the gap between theory and practice. It publishes papers that disclose significant new knowledge, exploratory developments, or practical applications in all aspects of technology needed to implement control systems.