FPGA hardware implementation of Q-learning algorithm with low resource consumption
Xiaojuan Liu, Jietao Diao, Nan Li
Proceedings of the 2022 International Conference on Pattern Recognition and Intelligent Systems, 2022-07-29
DOI: 10.1145/3549179.3549181 (https://doi.org/10.1145/3549179.3549181)
Citations: 0
Abstract
Q-learning is a reinforcement learning algorithm with applications across many fields. However, in settings such as robot control, where training time is tightly constrained, Q-learning implemented on a GPU or CPU may not meet the requirement. In this paper, we propose a novel serial acceleration architecture for the Q-learning algorithm and implement it on an xczu7ev-ffvc1156 FPGA using the Vivado 2019.1 development environment. As a result, resource consumption is reduced by about 50% compared with the architecture proposed in [1], and the update cycle of the Q-learning algorithm is fixed at 4 clock cycles.
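
For readers unfamiliar with the update that such an architecture accelerates, the following is a minimal Python sketch of the standard tabular Q-learning update. It is a software illustration only, not the authors' hardware design: the function name `q_update`, the learning rate `alpha`, the discount factor `gamma`, and the example table are assumptions introduced here, and the abstract does not say how the update is divided across the 4 clock cycles.

```python
from typing import List


def q_update(q: List[List[float]], s: int, a: int, r: float, s_next: int,
             alpha: float = 0.5, gamma: float = 0.9) -> float:
    """Apply one tabular Q-learning update in place and return the new Q(s, a).

    alpha (learning rate) and gamma (discount factor) are illustrative
    defaults, not values taken from the paper.
    """
    max_next = max(q[s_next])          # best estimated value in the next state
    td_target = r + gamma * max_next   # bootstrapped target
    td_error = td_target - q[s][a]     # temporal-difference error
    q[s][a] += alpha * td_error        # move the estimate toward the target
    return q[s][a]


# Hypothetical 2-state, 2-action table used only to exercise the function.
q_table = [[0.0, 0.0], [1.0, 2.0]]
new_value = q_update(q_table, s=0, a=1, r=1.0, s_next=1)
```

Each update needs one read of the next state's row, one maximum, one multiply-add for the target, and one write back to the table. That short, fixed sequence of operations is what makes the algorithm a plausible fit for a hardware datapath with a constant update latency.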