Robust quantum control using reinforcement learning from demonstration

Journal: npj Quantum Information · IF 6.6 · JCR Q1 (Physics, Applied) · CAS Tier 1 (Physics & Astronomy)
Authors: Shengyong Li, Yidian Fan, Xiang Li, Xinhui Ruan, Qianchuan Zhao, Zhihui Peng, Re-Bing Wu, Jing Zhang, Pengtao Song
DOI: 10.1038/s41534-025-01065-2 (https://doi.org/10.1038/s41534-025-01065-2)
Published: 2025-07-25 · Citations: 0

Abstract

Quantum control requires high-precision, robust control pulses to ensure optimal system performance. However, control sequences generated from a system model may suffer from model bias, leading to low fidelity. While model-free reinforcement learning (RL) methods have been developed to avoid such bias, training an RL agent from scratch is time-consuming, often taking hours to gather enough samples for convergence. This challenge has hindered the broad application of RL techniques to larger and more complex quantum control problems, limiting their adaptability. In this work, we use Reinforcement Learning from Demonstration (RLfD) to leverage control sequences generated with system models and then optimize them further with RL to remove the model bias. By starting from reasonable pulse shapes instead of learning from scratch, this approach improves sample efficiency, reducing the number of samples needed for convergence and hence the training time. As a result, the method can handle pulse shapes discretized into more than 1000 pieces without compromising final fidelity. We simulate the preparation of several high-fidelity non-classical states using RLfD, and we also find that the training process is more stable with RLfD. In addition, the method is well suited to fast RL-based gate calibration.
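The full paper is behind the DOI above, so as a rough illustration of the warm-start idea the abstract describes, the sketch below fine-tunes a deliberately biased model-based pulse with a simple REINFORCE-style perturbation agent. Everything in it is a hypothetical stand-in, not the authors' algorithm: measure_fidelity is a toy proxy for an experimentally measured fidelity, the 0.85 amplitude error plays the role of model bias, and the toy uses 64 segments where the paper handles more than 1000.

```python
# Minimal, self-contained sketch of the RLfD warm-start idea (assumptions only:
# the toy fidelity proxy, the Gaussian-perturbation agent, and all
# hyperparameters are illustrative stand-ins for the paper's actual setup).
import numpy as np

rng = np.random.default_rng(0)
N_SEG = 64                 # pulse segments (toy-sized; the paper scales to >1000)
N_ROLLOUTS = 16            # fidelity evaluations per update step
SIGMA, LR, STEPS = 0.1, 0.02, 200

# Unknown "true" optimal pulse, and a biased model-based demonstration
# (standing in for the output of an optimizer run on an imperfect model).
true_pulse = np.sin(np.linspace(0.0, np.pi, N_SEG))
demo_pulse = 0.85 * true_pulse          # systematic model bias: wrong amplitude

def measure_fidelity(pulse: np.ndarray) -> float:
    """Toy stand-in for an experimentally measured fidelity."""
    return float(np.exp(-np.mean((pulse - true_pulse) ** 2)))

# Model-free fine-tuning, warm-started from the demonstration. Each step
# samples Gaussian perturbations of the current pulse, measures their
# fidelities, and steps along the advantage-weighted perturbations
# (a REINFORCE-style ascent-direction estimate).
pulse = demo_pulse.copy()
for _ in range(STEPS):
    eps = rng.standard_normal((N_ROLLOUTS, N_SEG))
    rewards = np.array([measure_fidelity(pulse + SIGMA * e) for e in eps])
    adv = (rewards - rewards.mean()) / (rewards.std() + 1e-12)  # normalized advantage
    pulse += LR * (adv @ eps) / N_ROLLOUTS        # estimated ascent direction

print(f"demonstration fidelity: {measure_fidelity(demo_pulse):.4f}")
print(f"fine-tuned fidelity:    {measure_fidelity(pulse):.4f}")
```

In the real setting the reward would come from measurements on the device rather than from a known target pulse; the point of the demonstration is that the very first rollouts already earn high reward, which is the sample-efficiency argument the abstract makes.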

Source journal: npj Quantum Information (Computer Science: Computer Science, miscellaneous)
CiteScore: 13.70 · Self-citation rate: 3.90% · Articles per year: 130 · Review time: 29 weeks
Journal description: The scope of npj Quantum Information spans all relevant disciplines, fields, approaches and levels, and so considers outstanding work ranging from fundamental research to applications and technologies.