{"title":"课程强化学习的博弈论方法","authors":"M. Smyrnakis, Lan Hoang","doi":"10.1109/ICTAI56018.2022.00184","DOIUrl":null,"url":null,"abstract":"Current reinforcement learning automated curricu-lum approaches continual learning by updating the environment. The update is often treated as an optimisation problem - with the teacher agent updating the environment to optimise the student's learning. This work proposes an alternative framing of the problem using a game-theoretic formulation. The learning is defined by a leader - follower cooperative game. This formulation provides an approach for multi-agent curriculum learning that improves agent learning and provides more game equilibrium insights. We observed that under this framework, the agents converge faster to perform on the desired outcomes, compared to the reinforcement learning agent baseline.","PeriodicalId":354314,"journal":{"name":"2022 IEEE 34th International Conference on Tools with Artificial Intelligence (ICTAI)","volume":"4 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2022-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"A game theoretic approach to curriculum reinforcement learning\",\"authors\":\"M. Smyrnakis, Lan Hoang\",\"doi\":\"10.1109/ICTAI56018.2022.00184\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Current reinforcement learning automated curricu-lum approaches continual learning by updating the environment. The update is often treated as an optimisation problem - with the teacher agent updating the environment to optimise the student's learning. This work proposes an alternative framing of the problem using a game-theoretic formulation. The learning is defined by a leader - follower cooperative game. This formulation provides an approach for multi-agent curriculum learning that improves agent learning and provides more game equilibrium insights. We observed that under this framework, the agents converge faster to perform on the desired outcomes, compared to the reinforcement learning agent baseline.\",\"PeriodicalId\":354314,\"journal\":{\"name\":\"2022 IEEE 34th International Conference on Tools with Artificial Intelligence (ICTAI)\",\"volume\":\"4 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2022-10-01\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"2022 IEEE 34th International Conference on Tools with Artificial Intelligence (ICTAI)\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1109/ICTAI56018.2022.00184\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"2022 IEEE 34th International Conference on Tools with Artificial Intelligence (ICTAI)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/ICTAI56018.2022.00184","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
A game theoretic approach to curriculum reinforcement learning
Current automated curriculum approaches in reinforcement learning achieve continual learning by updating the environment. The update is often treated as an optimisation problem, with the teacher agent updating the environment to optimise the student's learning. This work proposes an alternative framing of the problem using a game-theoretic formulation: learning is defined as a leader-follower cooperative game. This formulation provides an approach to multi-agent curriculum learning that improves agent learning and offers additional insight into the game's equilibria. We observed that, under this framework, the agents converge faster to the desired outcomes than the reinforcement learning agent baseline.
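To make the leader-follower framing concrete, the sketch below shows a generic cooperative Stackelberg-style curriculum loop: a teacher (leader) commits to an environment difficulty while anticipating how a student (follower) will learn on it. This is an illustrative toy only, not the authors' implementation; the Teacher and Student classes, the skill model, and the candidate difficulties are hypothetical placeholders.

```python
# Illustrative leader-follower curriculum loop (toy model, not the paper's code).
import copy


class Student:
    """Follower: a toy learner whose skill improves fastest on tasks near its level."""

    def __init__(self):
        self.skill = 0.0

    def train(self, difficulty, episodes=10):
        for _ in range(episodes):
            gap = abs(difficulty - self.skill)
            # Toy learning rule: progress shrinks as the task gets too easy or too hard.
            self.skill += max(0.0, 0.05 * (1.0 - gap))
        return self.skill


class Teacher:
    """Leader: commits to a difficulty, anticipating the student's learning response."""

    def __init__(self, candidate_difficulties):
        self.candidates = candidate_difficulties

    def choose(self, student):
        # Cooperative Stackelberg step: simulate the follower's response to each
        # candidate difficulty and pick the one yielding the largest skill gain.
        def predicted_skill(d):
            return copy.deepcopy(student).train(d)

        return max(self.candidates, key=predicted_skill)


def curriculum_loop(rounds=10):
    student = Student()
    teacher = Teacher([0.1 * k for k in range(1, 11)])
    for t in range(rounds):
        difficulty = teacher.choose(student)   # leader moves first
        skill = student.train(difficulty)      # follower best-responds by learning
        print(f"round {t:2d}: difficulty={difficulty:.1f} skill={skill:.2f}")


if __name__ == "__main__":
    curriculum_loop()
```

Because the payoffs are aligned (the teacher's objective is the student's progress), the game is cooperative; the leader-follower structure only fixes the order of moves, with the teacher committing to the environment before the student adapts.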