Reinforcement Learning for Continuous Control: A Quantum Normalized Advantage Function Approach
Yaofu Liu, Chang Xu, Siyuan Jin
2023 IEEE International Conference on Quantum Software (QSW), July 2023. DOI: 10.1109/QSW59989.2023.00020
In this study, we present a new approach to quantum reinforcement learning that handles tasks with continuous action spaces. Our method, a quantum version of the classical normalized advantage function (QNAF), needs only a Q-value network, realized as a quantum neural network, and avoids any policy network: because the advantage term is quadratic in the action, the greedy action is available in closed form. We implemented the method in the TensorFlow framework. On standard Gym benchmarks, QNAF requires fewer adjustable parameters than classical NAF and prior quantum methods. Furthermore, it shows improved stability, reliably converging regardless of the initial random parameters.
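The classical NAF construction that the abstract refers to can be sketched as follows. This is a minimal NumPy illustration of the standard NAF parameterization (a value head V(s), an action head mu(s), and a lower-triangular head L(s)); it is not the paper's quantum implementation, and the head names are assumptions drawn from the original NAF formulation rather than from this paper.

```python
import numpy as np

def naf_q(v, mu, L, action):
    """Normalized advantage function:
        Q(s, a) = V(s) + A(s, a),
        A(s, a) = -1/2 (a - mu)^T P (a - mu),  P = L L^T (positive semi-definite).
    Since A <= 0 and A = 0 exactly at a = mu, the greedy action is mu(s)
    itself -- which is why no separate policy network is needed."""
    P = L @ L.T                      # positive semi-definite precision matrix
    diff = action - mu
    advantage = -0.5 * diff @ P @ diff
    return v + advantage

# Hypothetical head outputs for one state (illustrative values only)
v = 1.5
mu = np.array([0.2, -0.3])
L = np.array([[1.0, 0.0],
              [0.5, 2.0]])          # lower-triangular, so P = L L^T is PSD

print(naf_q(v, mu, L, mu))          # at a = mu the advantage vanishes: Q = V = 1.5
print(naf_q(v, mu, L, mu + 0.1))    # any other action scores strictly less
```

The quadratic form guarantees that maximizing Q over continuous actions is trivial, so training reduces to fitting the single Q-network; QNAF replaces that network with a quantum neural network.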