Performance comparison of the quantum and classical deep Q-learning approaches in dynamic environments control
Authors: Aramchehr Zare, Mehrdad Boroushaki
Journal: EPJ Quantum Technology, volume 12, issue 1 (published 2025-06-16)
DOI: 10.1140/epjqt/s40507-025-00381-y
URL: https://link.springer.com/article/10.1140/epjqt/s40507-025-00381-y
Quantum Reinforcement Learning (QRL) algorithms have rarely been studied for the control of dynamic environments, which leaves a significant gap in the field. This study helps bridge that gap by demonstrating the potential of quantum RL algorithms to handle dynamic environments effectively. The performance and robustness of Quantum Deep Q-learning Networks (DQNs) were examined in two dynamic environments, Cart Pole and Lunar Lander, using three distinct quantum Ansatz layers: RealAmplitudes, EfficientSU2, and TwoLocal. The quantum DQNs were compared with classical DQN algorithms in terms of convergence speed, loss minimization, and Q-value behavior. The RealAmplitudes Ansatz outperformed the other quantum circuits, converging faster and minimizing the loss function more effectively. To assess robustness, the pole length was increased in the Cart Pole environment, and a wind function was added to the Lunar Lander environment after the 50th episode. All three quantum Ansatz layers maintained robust performance under these disturbed conditions, with consistent reward values, continued loss minimization, and stable Q-value distributions. Although the proposed QRL approach delivers competitive results overall, classical RL can surpass it in convergence speed under specific conditions.
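The comparison above rests on the loss and Q-values that a DQN produces, whether the Q-function is a classical network or a parameterized quantum circuit. The following is a minimal sketch of the standard one-step temporal-difference target and mean squared loss used in DQN-style training. It is not the authors' implementation: the abstract does not state the loss form, the discount factor, or the use of a target network, so `gamma`, `td_target`, `dqn_loss`, and the stand-in `constant_q` function are all assumptions made for illustration.

```python
from typing import Callable, List, Sequence, Tuple

# A transition is (state, action, reward, next_state, done).
Transition = Tuple[Sequence[float], int, float, Sequence[float], bool]
# A Q-function maps a state to one Q-value per action. It could be a classical
# network or a quantum circuit; here it is just a callable (assumption).
QFunction = Callable[[Sequence[float]], List[float]]


def td_target(reward: float, next_q_values: Sequence[float], done: bool,
              gamma: float = 0.99) -> float:
    """One-step TD target: r + gamma * max_a' Q(s', a'), or just r at episode end."""
    if done:
        return reward
    return reward + gamma * max(next_q_values)


def dqn_loss(batch: Sequence[Transition], q_online: QFunction,
             q_target: QFunction, gamma: float = 0.99) -> float:
    """Mean squared TD error over a batch of transitions."""
    total = 0.0
    for state, action, reward, next_state, done in batch:
        target = td_target(reward, q_target(next_state), done, gamma)
        prediction = q_online(state)[action]
        total += (target - prediction) ** 2
    return total / len(batch)


# Hypothetical stand-in Q-function with two actions (as in Cart Pole).
# It ignores the state and returns fixed values, purely for illustration.
def constant_q(state: Sequence[float]) -> List[float]:
    return [1.0, 2.0]


batch = [
    ([0.0, 0.0, 0.0, 0.0], 0, 1.0, [0.1, 0.0, 0.0, 0.0], False),
    ([0.1, 0.0, 0.0, 0.0], 1, 1.0, [0.2, 0.0, 0.0, 0.0], True),
]
loss = dqn_loss(batch, constant_q, constant_q, gamma=0.5)
```

In this toy batch both transitions have a squared TD error of 1, so the mean loss is 1.0. In an actual experiment, `q_online` and `q_target` would be replaced by the trained model and its periodically synchronized copy, and the loss would be minimized with respect to the model parameters.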
About the journal:
Driven by advances in technology and experimental capability, the last decade has seen the emergence of quantum technology: a new praxis for controlling the quantum world. It is now possible to engineer complex, multi-component systems that merge the once distinct fields of quantum optics and condensed matter physics.
EPJ Quantum Technology covers theoretical and experimental advances in subjects including but not limited to the following:
Quantum measurement, metrology and lithography
Quantum complex systems, networks and cellular automata
Quantum electromechanical systems
Quantum optomechanical systems
Quantum machines, engineering and nanorobotics
Quantum control theory
Quantum information, communication and computation
Quantum thermodynamics
Quantum metamaterials
The effect of Casimir forces on micro- and nano-electromechanical systems
Quantum biology
Quantum sensing
Hybrid quantum systems
Quantum simulations.