Curiosity-Driven Exploration Effectiveness on Various Environments
K. Charoenpitaks, Y. Limpiyakorn
Proceedings of the 3rd International Conference on Vision, Image and Signal Processing, 2019-08-26
DOI: 10.1145/3387168.3387235 (https://doi.org/10.1145/3387168.3387235)
Citations: 2
Abstract
Hand-crafted reward functions have never been a scalable solution for real-world problems. Self-generated intrinsic rewards, inspired by human curiosity, may be one scalable answer to the sparse-reward problem. This research therefore investigated the effectiveness of selected techniques based on the theory of curiosity-driven exploration. Six algorithms, including count-based, prediction-based, and other methods, were evaluated on various OpenAI Gym environments. The results showed that, in many cases, the exploration algorithms affected the software agent's ability to find optimal solutions relative to the baseline. Still, there was no clear winner among the selected exploration methods, and the best scalable exploration method remains to be found. The main finding is that adding a small amount of intrinsic reward noise helps improve sample efficiency in the short run.
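
As an illustration of the count-based family mentioned in the abstract, the following minimal sketch adds an intrinsic bonus that shrinks as a state is revisited. It is not the authors' implementation: the bonus form beta / sqrt(N(s)), the class name `CountBonus`, and the parameter `beta` are assumptions chosen for illustration, and states are assumed to be hashable (for example, discretized observations).

```python
from collections import defaultdict
from math import sqrt


class CountBonus:
    """Hypothetical count-based intrinsic reward: beta / sqrt(N(s))."""

    def __init__(self, beta=0.1):
        self.beta = beta                # assumed bonus scale, not from the paper
        self.counts = defaultdict(int)  # visit counts N(s) per hashable state

    def reward(self, state, extrinsic):
        """Record a visit to `state` and return extrinsic plus intrinsic reward."""
        self.counts[state] += 1
        intrinsic = self.beta / sqrt(self.counts[state])
        return extrinsic + intrinsic


bonus = CountBonus(beta=1.0)
first = bonus.reward(state=(0, 0), extrinsic=0.0)       # 1st visit: bonus 1.0
for _ in range(3):
    fourth = bonus.reward(state=(0, 0), extrinsic=0.0)  # 4th visit: bonus 0.5
```

Because the bonus decays with the visit count, rarely seen states stay attractive while familiar ones contribute little, which is the intuition behind using such rewards when the environment's own reward is sparse.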