TD(λ) and Q-learning based Ludo players
Majed Alhajry, Faisal Alvi, Moataz A. Ahmed
2012 IEEE Conference on Computational Intelligence and Games (CIG), 2012-09-01
DOI: 10.1109/CIG.2012.6374142 (https://doi.org/10.1109/CIG.2012.6374142)
Reinforcement learning is a popular machine learning technique whose inherent self-learning ability has made it the candidate of choice for game AI. In this work we propose an expert player for Ludo, built by further enhancing our proposed basic strategies. We then implement a TD(λ)-based Ludo player and use our expert player to train it. We also implement a Q-learning-based Ludo player using the knowledge obtained from building the expert player. Our results show that while our TD(λ)-based and Q-learning-based Ludo players outperform the expert player, they do so only slightly, suggesting that our expert player is a tough opponent. Further improvements to our RL players may lead to the eventual development of a near-optimal player for Ludo.
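For readers unfamiliar with the two learning rules named above, the following is a minimal textbook-style sketch of a tabular TD(λ) update (with accumulating eligibility traces) and a tabular Q-learning update. It is not the authors' implementation: the abstract does not describe their state encoding, reward scheme, or whether a function approximator is used, so the state labels, action labels, and the hyperparameters `alpha`, `gamma`, and `lam` below are all hypothetical.

```python
from collections import defaultdict


def td_lambda_update(V, traces, state, reward, next_state, done,
                     alpha=0.1, gamma=0.9, lam=0.8):
    """One tabular TD(lambda) step with accumulating eligibility traces."""
    target = reward + (0.0 if done else gamma * V[next_state])
    delta = target - V[state]            # TD error for this transition
    traces[state] += 1.0                 # the visited state becomes eligible
    for s in list(traces):
        V[s] += alpha * delta * traces[s]  # credit all recently visited states
        traces[s] *= gamma * lam           # decay every trace
    return delta


def q_learning_update(Q, state, action, reward, next_state, next_actions, done,
                      alpha=0.1, gamma=0.9):
    """One tabular Q-learning step (off-policy: max over next actions)."""
    if done or not next_actions:
        best_next = 0.0
    else:
        best_next = max(Q[(next_state, a)] for a in next_actions)
    delta = reward + gamma * best_next - Q[(state, action)]
    Q[(state, action)] += alpha * delta
    return delta


# Hypothetical two-step episode: s0 -> s1 -> terminal, reward 1.0 at the end.
V = defaultdict(float)
traces = defaultdict(float)
td_lambda_update(V, traces, "s0", 0.0, "s1", done=False)
td_lambda_update(V, traces, "s1", 1.0, "end", done=True)

# Hypothetical Q-learning transitions with made-up action labels.
Q = defaultdict(float)
q_learning_update(Q, "s0", "move_a", 1.0, "s1", ["move_a", "move_b"], done=False)
Q[("s1", "move_a")] = 2.0
q_learning_update(Q, "s0", "move_a", 1.0, "s1", ["move_a", "move_b"], done=False)
```

In the TD(λ) episode, the final reward updates both `s1` and, through its decayed trace, the earlier state `s0`; this backward credit assignment is what distinguishes TD(λ) from a one-step update. The Q-learning rule instead bootstraps from the best available action value in the next state.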