Data-driven planning of large-scale electric vehicle charging hubs using deep reinforcement learning

IF 7.6 | Region 1: Engineering & Technology | JCR Q1: TRANSPORTATION SCIENCE & TECHNOLOGY

Transportation Research Part C-Emerging Technologies | Pub Date: 2025-05-21 | DOI: 10.1016/j.trc.2025.105126

Karsten Schroer, Ramin Ahadi, Wolfgang Ketter, Thomas Y. Lee

Volume 177, Article 105126. Full text: https://www.sciencedirect.com/science/article/pii/S0968090X25001305

Citations: 0

Abstract

We consider the problem of planning large-scale service systems, specifically electric vehicle (EV) charging hubs (EVCHs). EVCHs are locally concentrated clusters of charging infrastructure, e.g., in large parking lots, and are often integrated with on-site generation, storage and adjacent building infrastructure. Planning such complex operational systems over a multi-year investment horizon represents a high-dimensional, dynamic and stochastic decision problem. Such planning problems typically rely on mathematical optimization frameworks which are subject to computational challenges (e.g., NP-hardness) that can limit scalability to practical system sizes. As a result, simplifying assumptions related to, for example, temporal granularity, operational detail, system size, decision horizon or stochasticity are required to achieve tractability. Modern reinforcement learning (RL) approaches, in combination with fine-grained data-driven simulation frameworks, also known as Digital Twins (DTs), may circumvent these shortcomings. We develop a scalable soft actor-critic (SAC) reinforcement learning method that learns near-optimal EVCH configurations against a minimum cost objective. Our method uses a highly detailed DT of the EVCH environment that is bootstrapped with unique real-world sensor data from parking lots, charging stations, office buildings, and solar generation facilities, along with microscopic simulations of practical parking and charging policies. In extensive computational experiments, we provide empirical evidence that the proposed SAC RL algorithm converges closely to the global optimum (4%–15% gap), outperforming alternative popular RL approaches such as Deep Q Networks (DQN) and Deep Deterministic Policy Gradients (DDPG). We also demonstrate the superior scalability of our method to real-world problem sizes of up to 1000 charging spots. Finally, we run scenario analyses that explore the impact of user preferences and operational choices on planning decisions, thus providing actionable and novel policy guidance for EVCH planners and operators.
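
The abstract does not say how the 4%–15% gap is measured. As a rough illustration only, the sketch below assumes the common relative-gap definition for a minimization problem: the method's cost minus the optimal cost, divided by the optimal cost. The function name and the cost figures are hypothetical and are not taken from the paper.

```python
def optimality_gap(method_cost: float, optimal_cost: float) -> float:
    """Relative gap between a method's cost and the optimal cost.

    Assumes a minimization objective with a strictly positive optimum,
    so a gap of 0.04 means the method is 4% more expensive than optimal.
    """
    if optimal_cost <= 0:
        raise ValueError("optimal_cost must be strictly positive")
    return (method_cost - optimal_cost) / optimal_cost


if __name__ == "__main__":
    # Hypothetical total costs (arbitrary currency units), for illustration.
    rl_cost = 1_040_000.0
    benchmark_cost = 1_000_000.0
    print(f"gap = {optimality_gap(rl_cost, benchmark_cost):.1%}")
```

Under this assumed definition, a smaller gap means the learned configuration is closer in cost to the benchmark optimum.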


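Source journal

Transportation Research Part C-Emerging Technologies (Engineering & Technology: Transportation Science & Technology)

CiteScore: 15.80

Self-citation rate: 12.00%

Articles published: 332

Review time: 64 days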

About the journal: Transportation Research: Part C (TR_C) is dedicated to showcasing high-quality, scholarly research that delves into the development, applications, and implications of transportation systems and emerging technologies. Our focus lies not solely on individual technologies, but rather on their broader implications for the planning, design, operation, control, maintenance, and rehabilitation of transportation systems, services, and components. In essence, the intellectual core of the journal revolves around the transportation aspect rather than the technology itself. We actively encourage the integration of quantitative methods from diverse fields such as operations research, control systems, complex networks, computer science, and artificial intelligence. Join us in exploring the intersection of transportation systems and emerging technologies to drive innovation and progress in the field.