Reinforcement Learning for Continuous Control: A Quantum Normalized Advantage Function Approach
Yaofu Liu, Chang Xu, Siyuan Jin
2023 IEEE International Conference on Quantum Software (QSW), July 2023. DOI: 10.1109/QSW59989.2023.00020
In this study, we present a new approach to quantum reinforcement learning that handles tasks with continuous action spaces. Our method, a quantum version of the classical normalized advantage function (QNAF), needs only a Q-value network, realized as a quantum neural network, and avoids any policy network: because the advantage term is quadratic in the action, the greedy action is available in closed form. We implemented the method in the TensorFlow framework. On standard Gym benchmarks, QNAF requires fewer adjustable parameters than classical NAF and prior quantum methods. Furthermore, it shows improved stability, reliably converging regardless of the initial random parameters.
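The classical NAF construction that the abstract refers to can be sketched as follows. This is a minimal NumPy illustration of the standard NAF parameterization (a value head V(s), an action head mu(s), and a lower-triangular head L(s)); it is not the paper's quantum implementation, and the head names are assumptions drawn from the original NAF formulation rather than from this paper.

```python
import numpy as np

def naf_q(v, mu, L, action):
    """Normalized advantage function:
        Q(s, a) = V(s) + A(s, a),
        A(s, a) = -1/2 (a - mu)^T P (a - mu),  P = L L^T (positive semi-definite).
    Since A <= 0 and A = 0 exactly at a = mu, the greedy action is mu(s)
    itself -- which is why no separate policy network is needed."""
    P = L @ L.T                      # positive semi-definite precision matrix
    diff = action - mu
    advantage = -0.5 * diff @ P @ diff
    return v + advantage

# Hypothetical head outputs for one state (illustrative values only)
v = 1.5
mu = np.array([0.2, -0.3])
L = np.array([[1.0, 0.0],
              [0.5, 2.0]])          # lower-triangular, so P = L L^T is PSD

print(naf_q(v, mu, L, mu))          # at a = mu the advantage vanishes: Q = V = 1.5
print(naf_q(v, mu, L, mu + 0.1))    # any other action scores strictly less
```

The quadratic form guarantees that maximizing Q over continuous actions is trivial, so training reduces to fitting the single Q-network; QNAF replaces that network with a quantum neural network.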