Artificially intelligent Maxwell’s demon for optimal control of open quantum systems

IF 5.6; Zone 2 (Physics and Astrophysics); JCR Q1 (Physics, Multidisciplinary)

Quantum Science and Technology; Pub Date: 2025-03-27; DOI: 10.1088/2058-9565/adbccf

Paolo A Erdman, Robert Czupryniak, Bibek Bhandari, Andrew N Jordan, Frank Noé, Jens Eisert and Giacomo Guarnieri


Citations: 0

Abstract

Feedback control of open quantum systems is of fundamental importance for practical applications in various contexts, ranging from quantum computation to quantum error correction and quantum metrology. Its use in the context of thermodynamics further enables the study of the interplay between information and energy. However, deriving optimal feedback control strategies is highly challenging, as it involves the optimal control of open quantum systems, the stochastic nature of quantum measurement, and the inclusion of policies that maximize a long-term time- and trajectory-averaged goal. In this work, we employ a reinforcement learning approach to automate and capture the role of a quantum Maxwell’s demon: the agent takes the literal role of discovering optimal feedback control strategies in qubit-based systems that maximize a trade-off between measurement-powered cooling and measurement efficiency. Considering weak or projective quantum measurements, we explore different regimes based on the ordering between the thermalization, the measurement, and the unitary feedback timescales, finding different and highly non-intuitive, yet interpretable, strategies. In the thermalization-dominated regime, we find strategies with elaborate finite-time thermalization protocols conditioned on measurement outcomes. In the measurement-dominated regime, we find that optimal strategies involve adaptively measuring different qubit observables reflecting the acquired information, and repeating multiple weak measurements until the quantum state is ‘sufficiently pure’, leading to random walks in state space. Finally, we study the case when all timescales are comparable, finding new feedback control strategies that considerably outperform more intuitive ones. We discuss a two-qubit example where we explore the role of entanglement and conclude discussing the scaling of our results to quantum many-body systems.
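The measurement-dominated strategy described above, repeating weak measurements until the state is "sufficiently pure", can be illustrated with a toy simulation. This is a minimal sketch and not the authors' method: it has no reinforcement-learning agent, no thermal bath, and no unitary feedback, and it always measures the same observable (sigma_z) rather than choosing one adaptively. The Kraus-operator parametrization, the `strength` and `threshold` values, and all function names are assumptions made for illustration.

```python
import numpy as np

def weak_z_kraus(strength):
    """Kraus pair for a weak sigma_z measurement (assumed parametrization).

    strength in (0, 1]: 1 is a projective measurement, values near 0
    extract very little information per shot.
    """
    a = np.sqrt((1 + strength) / 2)
    b = np.sqrt((1 - strength) / 2)
    return [np.diag([a, b]), np.diag([b, a])]

def purity(rho):
    """Purity Tr(rho^2): 0.5 for a maximally mixed qubit, 1 for a pure state."""
    return float(np.real(np.trace(rho @ rho)))

def measure_until_pure(rho, strength=0.3, threshold=0.95,
                       max_steps=10_000, seed=0):
    """Repeat weak measurements until the purity reaches the threshold.

    Returns the final state and the purity after every measurement.
    """
    rng = np.random.default_rng(seed)
    kraus = weak_z_kraus(strength)
    history = [purity(rho)]
    for _ in range(max_steps):
        if history[-1] >= threshold:
            break
        # Born-rule probability of each outcome.
        probs = np.array(
            [np.real(np.trace(K @ rho @ K.conj().T)) for K in kraus]
        )
        probs = probs / probs.sum()  # guard against rounding drift
        outcome = rng.choice(2, p=probs)
        # Conditional (post-measurement) state for the sampled outcome.
        K = kraus[outcome]
        rho = K @ rho @ K.conj().T
        rho = rho / np.real(np.trace(rho))
        history.append(purity(rho))
    return rho, history

if __name__ == "__main__":
    final_rho, purities = measure_until_pure(np.eye(2) / 2)
    print(f"measurements used: {len(purities) - 1}")
    print(f"final purity: {purities[-1]:.4f}")
```

Because each outcome is random, the recorded purity moves up and down from shot to shot rather than increasing steadily, which is the random-walk behaviour the abstract refers to. The number of measurements needed depends on the seed and on the assumed measurement strength.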


Source journal

Quantum Science and Technology (Materials Science: Materials Science (miscellaneous))

CiteScore: 11.20
Self-citation rate: 3.00%
Articles published: 133

About the journal: Driven by advances in technology and experimental capability, the last decade has seen the emergence of quantum technology: a new praxis for controlling the quantum world. It is now possible to engineer complex, multi-component systems that merge the once distinct fields of quantum optics and condensed matter physics. Quantum Science and Technology is a new multidisciplinary, electronic-only journal, devoted to publishing research of the highest quality and impact covering theoretical and experimental advances in the fundamental science and application of all quantum-enabled technologies.