Yixian Zhang;Zhuoxuan Li;Yiding Cao;Xuan Zhao;Jinde Cao
IEEE Transactions on Games, vol. 16, no. 3, pp. 544-555. Published 2023-08-28. DOI: 10.1109/TG.2023.3308898. https://ieeexplore.ieee.org/document/10232894/
Deep Reinforcement Learning Using Optimized Monte Carlo Tree Search in EWN
EinStein würfelt nicht! (EWN) is a perfect-information stochastic game in which randomness strongly influences the course of play. In this article, we propose an optimized algorithm, Quick Neural Network Tree Search (QNNTS), based on deep reinforcement learning and Monte Carlo tree search (MCTS), to build an artificial intelligence agent for EWN. The model is lightweight, which makes it possible to train with far fewer computing resources. The MCTS-based optimization structure of the algorithm is named Optimized Upper Confidence Bound Applied to Tree with Heuristic Search; it introduces an expectation valuation strategy into MCTS. As the prerequisite component of QNNTS, it improves the winning rate. Finally, an Attention-ResNet structure combined with domain knowledge is used to obtain the proposed algorithm. Against several conventional algorithms, it achieves winning rates of at least 68%.
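The abstract does not spell out the expectation valuation strategy, so the following is only a minimal sketch of the generic ideas it builds on, not the authors' QNNTS or their optimized UCT variant. It shows the standard UCB1 score used in UCT selection, and a chance-node value that averages over the six die faces under the usual EWN rule (the rolled piece moves if it is still on the board; otherwise the nearest lower or higher surviving piece may move). The function names and the `piece_value` mapping (piece number to an estimated value of that piece's best move) are hypothetical and introduced for illustration.

```python
import math


def ucb1(total_value, visits, parent_visits, c=math.sqrt(2)):
    """Standard UCB1 score used by UCT; unvisited children score +inf."""
    if visits == 0:
        return math.inf
    exploit = total_value / visits
    explore = c * math.sqrt(math.log(parent_visits) / visits)
    return exploit + explore


def movable_pieces(remaining, roll):
    """EWN die rule: the rolled piece moves if it is still on the board;
    otherwise the nearest lower and nearest higher survivors are eligible."""
    if roll in remaining:
        return [roll]
    lower = [p for p in remaining if p < roll]
    higher = [p for p in remaining if p > roll]
    options = []
    if lower:
        options.append(max(lower))
    if higher:
        options.append(min(higher))
    return options


def chance_node_value(remaining, piece_value):
    """Expected value before the die is rolled: the average, over the six
    faces, of the best value among the pieces that face allows to move.
    Assumes `remaining` is non-empty (the game is not over)."""
    total = 0.0
    for roll in range(1, 7):
        total += max(piece_value[p] for p in movable_pieces(remaining, roll))
    return total / 6


if __name__ == "__main__":
    # Hypothetical position: only pieces 1 and 4 survive.
    print(movable_pieces({1, 4}, 2))
    print(chance_node_value({1, 4}, {1: 0.2, 4: 0.8}))
```

In this hypothetical position, piece 4 is eligible on five of the six faces, so the averaged value is pulled toward that piece's estimate. A search that backs up such averages at dice nodes accounts for how often each piece can actually be moved, rather than assuming the best piece is always available.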