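Intelligent Wireless Power Scheduling for Lunar Multienergy Systems: Deep Reinforcement Learning for Real-Time Adaptive Beam Steering and Vehicle-to-Grid Energy Optimization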

IF 1.9; CAS Zone 4 (Engineering Technology); JCR Q3; ENGINEERING, ELECTRICAL & ELECTRONIC

International Transactions on Electrical Energy Systems; Pub Date: 2025-06-08; DOI: 10.1155/etep/9877968

Thomas Tongxin Li, Shuangqi Li, Cynthia Xin Ding, Zhaoyao Bao, Mohannad Alhazmi

Citations: 0

Abstract

Intelligent Wireless Power Scheduling for Lunar Multienergy Systems: Deep Reinforcement Learning for Real-Time Adaptive Beam Steering and Vehicle-to-Grid Energy Optimization

The integration of wireless power transfer (WPT) and vehicle-to-grid (V2G) technologies is essential for the sustainable operation of lunar multienergy virtual power plants (MEVPPs), where rovers, habitats, and in situ resource utilization (ISRU) facilities rely on adaptive energy management. Unlike terrestrial systems, lunar environments present extreme challenges, including long-duration night cycles, regolith dust accumulation, severe temperature fluctuations, and dynamic rover mobility, all of which disrupt efficient power delivery. This paper proposes a reinforcement learning–based adaptive beam steering framework to optimize WPT scheduling, ensuring continuous and efficient energy transmission for both mobile and stationary lunar assets. Unlike traditional fixed-beam or heuristic-based WPT methods, the proposed system utilizes deep reinforcement learning (DRL) with proximal policy optimization (PPO) to autonomously adjust beam direction, power intensity, and charging priority in response to real-time rover movements, V2G interactions, and fluctuating energy demands. The proposed framework models WPT optimization as a Markov decision process (MDP), where the agent learns to dynamically adapt beam steering based on rover speed, response delay, solar power availability, and charging station congestion. The reward function penalizes energy misallocation and misalignment losses while maximizing charging efficiency and systemwide energy resilience. A case study simulating a 30-day mission near Shackleton Crater evaluates the effectiveness of the AI–driven WPT system, demonstrating a 54.6% reduction in energy downtime and a 41.3% improvement in beam alignment efficiency compared to static power scheduling methods. In addition, the system reduces latency-induced power deficits by 39.8%, ensuring reliable power distribution for ISRU oxygen extraction, habitat life support, and rover recharging stations. 
This study represents a novel advancement in lunar power infrastructure, integrating AI–driven adaptive WPT with intelligent energy scheduling to enhance V2G interactions in extraterrestrial environments. The results validate the feasibility of DRL–based WPT control, paving the way for scalable, resilient, and self-optimizing wireless power grids on the Moon. Future work will explore the integration of hybrid energy storage models, quantum-inspired optimization for real-time decision-making, and predictive beamforming algorithms to further enhance the reliability and efficiency of lunar energy networks.
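
The abstract describes the reward only qualitatively (penalize energy misallocation and misalignment losses, reward charging efficiency) and names PPO as the learning algorithm. As a minimal sketch of how such a reward and the standard PPO clipped surrogate term might look, the Python below makes several assumptions that are not taken from the paper: the `RoverState` fields, the Gaussian pointing-loss model, the weights `w_misalloc` and `w_misalign`, the normalizing constant `rated_kw`, and all default values are hypothetical and chosen only for illustration.

```python
import math
from dataclasses import dataclass


@dataclass
class RoverState:
    """Hypothetical per-step observation; field names are illustrative only."""
    speed_m_s: float         # rover speed
    response_delay_s: float  # beam-steering response delay
    solar_power_kw: float    # solar power currently available
    station_queue: int       # rovers waiting at the charging station
    demand_kw: float         # power the rover currently needs


def alignment_efficiency(pointing_error_rad: float, beam_width_rad: float) -> float:
    """Assumed Gaussian pointing-loss model: 1.0 when aligned, decaying with error."""
    return math.exp(-2.0 * (pointing_error_rad / beam_width_rad) ** 2)


def reward(state: RoverState, delivered_kw: float, pointing_error_rad: float,
           beam_width_rad: float = 0.05, rated_kw: float = 10.0,
           w_misalloc: float = 1.0, w_misalign: float = 1.0) -> float:
    """Charging benefit minus misallocation and misalignment penalties (assumed form)."""
    eff = alignment_efficiency(pointing_error_rad, beam_width_rad)
    received_kw = delivered_kw * eff
    # Only power up to the rover's demand counts as useful charging.
    charging_term = min(received_kw, state.demand_kw) / rated_kw
    # Over- or under-delivery relative to demand is treated as misallocation.
    misalloc = abs(received_kw - state.demand_kw) / rated_kw
    misalign = 1.0 - eff
    return charging_term - w_misalloc * misalloc - w_misalign * misalign


def ppo_clipped_term(ratio: float, advantage: float, clip_eps: float = 0.2) -> float:
    """Per-sample PPO clipped surrogate: min(r * A, clip(r, 1 - eps, 1 + eps) * A)."""
    clipped_ratio = max(1.0 - clip_eps, min(ratio, 1.0 + clip_eps))
    return min(ratio * advantage, clipped_ratio * advantage)


if __name__ == "__main__":
    s = RoverState(speed_m_s=0.3, response_delay_s=0.1,
                   solar_power_kw=8.0, station_queue=2, demand_kw=5.0)
    print(reward(s, delivered_kw=5.0, pointing_error_rad=0.0))   # perfectly aligned
    print(reward(s, delivered_kw=5.0, pointing_error_rad=0.05))  # misaligned beam
    print(ppo_clipped_term(ratio=1.5, advantage=1.0))
```

In this sketch a misaligned beam lowers the reward twice, once through the explicit misalignment penalty and once because less power reaches the rover, which is one plausible way to encode the trade-off the abstract describes. The clipped term is the textbook PPO-Clip form; whether the authors use this exact variant is not stated in the abstract.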

Source journal

International Transactions on Electrical Energy Systems (ENGINEERING, ELECTRICAL & ELECTRONIC)

CiteScore: 6.70

Self-citation rate: 8.70%

Articles published: 342

About the journal: International Transactions on Electrical Energy Systems publishes original research results on key advances in the generation, transmission, and distribution of electrical energy systems. Of particular interest are submissions concerning the modeling, analysis, optimization and control of advanced electric power systems. Manuscripts on topics of economics, finance, policies, insulation materials, low-voltage power electronics, plasmas, and magnetics will generally not be considered for review.