Measurement-Informed Safe Reinforcement Learning for Quantum Battery Charging via Harmonic-Syndrome Diagnostics and BMS Constraints

IF 4.6

IEEE Transactions on Quantum Engineering Pub Date : 2026-01-01 Epub Date: 2026-03-03 DOI:10.1109/TQE.2026.3670136

Sangkeum Lee;Beomdo Park;Junseong Park;Hyeonseok Jang;Hoon Jeong;Taewook Heo

Volume 7, pp. 1-15. Full text: https://ieeexplore.ieee.org/document/11419848/

Citations: 0

Abstract

Quantum batteries promise ultrafast energy storage but are highly sensitive to noise, drift, and hardware constraints, making safe high-performance charging a central challenge for noisy intermediate-scale quantum devices. We propose a measurement-informed safe control framework that couples harmonic-spectrum-based syndrome diagnostics ($H_{2}/H_{1}$, $H_{3}/H_{1}$, and frequency drift) with a battery management system (BMS)-constrained curriculum reinforcement learning (RL) policy. Spectral features are compressed into a three-level syndrome code ($s\in \lbrace 0,1,2\rbrace$) that serves as a real-time hardware risk proxy for the controller. Our digital-twin simulator incorporates $T_{1}/T_\phi$ relaxation, crosstalk, collective effects, and terminal-voltage dynamics, while safety risks are explicitly encoded as BMS-related penalties (state-of-health, voltage limits, and high-risk operation ratio) in the RL reward. Across staged curricula of increasing system complexity, the learned policy empirically traces a strictly improved Pareto frontier between final ergotropy and high-risk ratio compared to baseline and threshold-grid control strategies, with gains confirmed by multiseed statistical confidence intervals. To support near-term deployment, we position the current work as a digital-twin stage and outline a concrete simulation-to-real protocol: fix receiver-operating-characteristic-calibrated thresholds, retune $(\tau ^{w},\tau ^{h})$ on a small hardware calibration split, and validate a one-step voltage shield. We further demonstrate the framework on a benchtop transmon setup with $N=1$–2, reporting shield trigger/violation rates, sim-to-real drift of spectral features (Kullback–Leibler (KL) divergence and earth mover's distance (EMD)), and an end-to-end latency within 20 $\mu\mathrm{s}$, indicating that harmonic-syndrome-informed safe RL is a viable route toward practical quantum battery charging control.
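As a rough illustration of the syndrome idea described in the abstract, the sketch below computes $H_{2}/H_{1}$, $H_{3}/H_{1}$, and a relative frequency drift from a sampled signal and compresses them into a code $s\in \lbrace 0,1,2\rbrace$. The abstract does not specify the mapping, so everything here is an assumption: the function names, the single-bin DFT estimator, the rule that any feature above its threshold raises the code, the reading of $(\tau ^{w},\tau ^{h})$ as warning and high-risk thresholds, and the numeric threshold values are all hypothetical and not taken from the paper.

```python
import cmath
import math


def harmonic_amplitude(samples, sample_rate, freq):
    """Amplitude of the component at `freq` (Hz) via a single-bin DFT."""
    n = len(samples)
    acc = sum(x * cmath.exp(-2j * math.pi * freq * k / sample_rate)
              for k, x in enumerate(samples))
    return 2.0 * abs(acc) / n


def spectral_features(samples, sample_rate, f0, f_ref):
    """Return (H2/H1, H3/H1, relative drift of f0 from the reference f_ref)."""
    h1 = harmonic_amplitude(samples, sample_rate, f0)
    h2 = harmonic_amplitude(samples, sample_rate, 2 * f0)
    h3 = harmonic_amplitude(samples, sample_rate, 3 * f0)
    drift = abs(f0 - f_ref) / f_ref
    return (h2 / h1, h3 / h1, drift)


def syndrome_code(features, tau_w, tau_h):
    """Assumed rule: 2 if any feature exceeds its high-risk threshold,
    1 if any exceeds its warning threshold, otherwise 0."""
    if any(x > t for x, t in zip(features, tau_h)):
        return 2
    if any(x > t for x, t in zip(features, tau_w)):
        return 1
    return 0


def make_tone(a2, a3, f0=50.0, sample_rate=1000.0, n=1000):
    """Synthetic signal: unit fundamental plus 2nd and 3rd harmonics."""
    return [math.cos(2 * math.pi * f0 * k / sample_rate)
            + a2 * math.cos(2 * math.pi * 2 * f0 * k / sample_rate)
            + a3 * math.cos(2 * math.pi * 3 * f0 * k / sample_rate)
            for k in range(n)]


# Hypothetical per-feature thresholds (H2/H1, H3/H1, drift); illustrative only.
TAU_W = (0.05, 0.05, 0.01)
TAU_H = (0.15, 0.15, 0.03)

if __name__ == "__main__":
    feats = spectral_features(make_tone(0.1, 0.02), 1000.0, 50.0, 50.0)
    print(feats, syndrome_code(feats, TAU_W, TAU_H))
```

In a controller loop of the kind the abstract describes, a code like this would presumably be appended to the RL observation, with the thresholds retuned on a hardware calibration split; how the authors actually do this is not stated here.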


Source journal

IEEE Transactions on Quantum Engineering

CiteScore

8.00

Self-citation rate

0.00%

Articles published