Physics Informed Intrinsic Rewards in Reinforcement Learning
Authors: Jiazhou Jiang, M. Fu, Zhiyong Chen
Venue: 2022 Australian & New Zealand Control Conference (ANZCC)
Publication date: 2022-11-24
DOI: https://doi.org/10.1109/ANZCC56036.2022.9966956
Model-free algorithms in Reinforcement Learning (RL) are a powerful learning tool and have performed well on complex problems. However, RL training results are often poor when the reward function is sparse or misleading in the short term. In this paper, we propose a physics-informed intrinsic reward function that helps the agent overcome this difficulty. We evaluate the proposed intrinsic reward method on several types of actor-critic (AC) algorithms. The experimental results show a noticeable improvement.
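To make the idea of an intrinsic reward concrete, the following is a minimal sketch and not the authors' formulation, which the abstract does not specify. It assumes a simple pendulum swing-up task with a sparse task reward and adds a hypothetical energy-based bonus. The function names, the energy-gap shaping term, and the weight `beta` are all illustrative assumptions.

```python
import math


def extrinsic_reward(theta: float, tol: float = 0.1) -> float:
    """Sparse task reward: 1 only when the pendulum is near upright (theta = 0)."""
    return 1.0 if abs(theta) < tol else 0.0


def mechanical_energy(theta: float, theta_dot: float,
                      m: float = 1.0, l: float = 1.0, g: float = 9.81) -> float:
    """Kinetic plus potential energy of a simple pendulum, theta measured from upright."""
    kinetic = 0.5 * m * (l * theta_dot) ** 2
    potential = m * g * l * math.cos(theta)
    return kinetic + potential


def intrinsic_reward(theta: float, theta_dot: float,
                     m: float = 1.0, l: float = 1.0, g: float = 9.81) -> float:
    """Hypothetical physics-based bonus (an assumption, not the paper's definition).

    Penalises the gap between the current mechanical energy and the energy of
    the upright-at-rest state, scaled so that hanging at rest gives -1.
    """
    target = m * g * l  # energy when upright and at rest
    gap = abs(mechanical_energy(theta, theta_dot, m, l, g) - target)
    return -gap / (2.0 * target)


def total_reward(theta: float, theta_dot: float, beta: float = 0.1) -> float:
    """Reward passed to the actor-critic learner: extrinsic plus weighted intrinsic."""
    return extrinsic_reward(theta) + beta * intrinsic_reward(theta, theta_dot)
```

With this shaping, states far from the goal still produce a graded signal even though the task reward is zero there, which is the general role an intrinsic reward plays when the extrinsic reward is sparse. The weight `beta` would need tuning so that the bonus guides exploration without dominating the task reward.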