Performance comparison of the quantum and classical deep Q-learning approaches in dynamic environments control

IF 5.6 · CAS Tier 2 (Physics & Astronomy) · JCR Q1 (Optics)
Aramchehr Zare, Mehrdad Boroushaki
DOI: 10.1140/epjqt/s40507-025-00381-y
Journal: EPJ Quantum Technology, Vol. 12, No. 1
Published: 2025-06-16 (Journal Article)
Open-access PDF: https://epjquantumtechnology.springeropen.com/counter/pdf/10.1140/epjqt/s40507-025-00381-y
Article page: https://link.springer.com/article/10.1140/epjqt/s40507-025-00381-y
Citations: 0

Abstract

There is a lack of adequate studies on dynamic-environment control with Quantum Reinforcement Learning (QRL) algorithms, a significant gap in the field. This study contributes to bridging that gap by demonstrating the potential of quantum RL algorithms to handle dynamic environments effectively. The performance and robustness of quantum Deep Q-learning Networks (DQNs) were examined in two dynamic environments, Cart Pole and Lunar Lander, using three distinct quantum Ansatz layers: RealAmplitudes, EfficientSU2, and TwoLocal. The quantum DQNs were compared with classical DQN algorithms in terms of convergence speed, loss minimization, and Q-value behavior. The RealAmplitudes Ansatz outperformed the other quantum circuits, converging faster and minimizing the loss function more effectively. To assess robustness, the pole length was increased in the Cart Pole environment and a wind function was added to the Lunar Lander environment after the 50th episode. All three quantum Ansatz layers maintained robust performance under the disturbed conditions, with consistent reward values, loss minimization, and stable Q-value distributions. Although the proposed quantum DQNs deliver competitive results overall, classical RL can surpass them in convergence speed under specific conditions.
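The three circuit templates named in the abstract are standard Qiskit ansatz families. RealAmplitudes, the best performer here, alternates layers of RY rotations with CX entanglement, so the prepared state has only real amplitudes. As an illustration only (the paper's actual networks, qubit counts, and hyperparameters are not reproduced here), a minimal NumPy statevector sketch of a RealAmplitudes-style circuit with linear entanglement:

```python
import numpy as np

def ry(theta):
    # Single-qubit RY rotation matrix (real-valued).
    c, s = np.cos(theta / 2), np.sin(theta / 2)
    return np.array([[c, -s], [s, c]])

def apply_1q(state, gate, qubit, n):
    # Apply a single-qubit gate to `qubit` of an n-qubit statevector
    # (qubit 0 is the most significant bit in this convention).
    ops = [np.eye(2)] * n
    ops[qubit] = gate
    full = ops[0]
    for op in ops[1:]:
        full = np.kron(full, op)
    return full @ state

def apply_cx(state, ctrl, tgt, n):
    # CNOT: flip the target bit of every basis state whose control bit is 1.
    new = state.copy()
    for i in range(2 ** n):
        if (i >> (n - 1 - ctrl)) & 1:
            new[i] = state[i ^ (1 << (n - 1 - tgt))]
    return new

def real_amplitudes(params, n=2, reps=1):
    # RealAmplitudes-style ansatz: (reps + 1) RY layers separated by
    # linear CX entanglement; expects n * (reps + 1) angles.
    state = np.zeros(2 ** n)
    state[0] = 1.0  # start in |0...0>
    it = iter(params)
    for layer in range(reps + 1):
        for q in range(n):
            state = apply_1q(state, ry(next(it)), q, n)
        if layer < reps:
            for q in range(n - 1):
                state = apply_cx(state, q, q + 1, n)
    return state
```

In a quantum DQN, the n * (reps + 1) rotation angles play the role of a dense layer's weights: states are encoded into the circuit, the angles are trained by gradient descent on the Bellman loss, and measured expectation values serve as Q-value estimates.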

Source journal: EPJ Quantum Technology (Physics and Astronomy - Atomic and Molecular Physics, and Optics)
CiteScore: 7.70
Self-citation rate: 7.50%
Annual article output: 28
Review turnaround: 71 days
About the journal: Driven by advances in technology and experimental capability, the last decade has seen the emergence of quantum technology: a new praxis for controlling the quantum world. It is now possible to engineer complex, multi-component systems that merge the once distinct fields of quantum optics and condensed matter physics. EPJ Quantum Technology covers theoretical and experimental advances in subjects including but not limited to the following:
- Quantum measurement, metrology and lithography
- Quantum complex systems, networks and cellular automata
- Quantum electromechanical systems
- Quantum optomechanical systems
- Quantum machines, engineering and nanorobotics
- Quantum control theory
- Quantum information, communication and computation
- Quantum thermodynamics
- Quantum metamaterials
- The effect of Casimir forces on micro- and nano-electromechanical systems
- Quantum biology
- Quantum sensing
- Hybrid quantum systems
- Quantum simulations