基于强化学习的异构电池组最优循环

2021 IEEE International Conference on Communications, Control, and Computing Technologies for Smart Grids (SmartGridComm) Pub Date : 2021-09-15 DOI:10.1109/SmartGridComm51999.2021.9632307

Vivek Deulkar, J. Nair

{"title":"基于强化学习的异构电池组最优循环","authors":"Vivek Deulkar, J. Nair","doi":"10.1109/SmartGridComm51999.2021.9632307","DOIUrl":null,"url":null,"abstract":"We consider the problem of optimal charging/discharging of a bank of heterogenous battery units, driven by stochastic electricity generation and demand processes. The batteries in the battery bank may differ with respect to their capacities, ramp constraints, losses, as well as cycling costs. The goal is to minimize the degradation costs associated with battery cycling in the long run; this is posed formally as a Markov decision process. We propose a linear function approximation based Q-learning algorithm for learning the optimal solution, using a specially designed class of kernel functions that approximate the structure of the value functions associated with the MDP. The proposed algorithm is validated via an extensive case study.","PeriodicalId":378884,"journal":{"name":"2021 IEEE International Conference on Communications, Control, and Computing Technologies for Smart Grids (SmartGridComm)","volume":"14 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2021-09-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"Optimal Cycling of a Heterogenous Battery Bank via Reinforcement Learning\",\"authors\":\"Vivek Deulkar, J. Nair\",\"doi\":\"10.1109/SmartGridComm51999.2021.9632307\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"We consider the problem of optimal charging/discharging of a bank of heterogenous battery units, driven by stochastic electricity generation and demand processes. The batteries in the battery bank may differ with respect to their capacities, ramp constraints, losses, as well as cycling costs. The goal is to minimize the degradation costs associated with battery cycling in the long run; this is posed formally as a Markov decision process. We propose a linear function approximation based Q-learning algorithm for learning the optimal solution, using a specially designed class of kernel functions that approximate the structure of the value functions associated with the MDP. The proposed algorithm is validated via an extensive case study.\",\"PeriodicalId\":378884,\"journal\":{\"name\":\"2021 IEEE International Conference on Communications, Control, and Computing Technologies for Smart Grids (SmartGridComm)\",\"volume\":\"14 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2021-09-15\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"2021 IEEE International Conference on Communications, Control, and Computing Technologies for Smart Grids (SmartGridComm)\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1109/SmartGridComm51999.2021.9632307\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"2021 IEEE International Conference on Communications, Control, and Computing Technologies for Smart Grids (SmartGridComm)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/SmartGridComm51999.2021.9632307","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}

引用次数: 0

摘要

我们考虑了在随机发电和随机需求过程驱动下的一组异质电池单元的最优充放电问题。电池组中的电池可能在容量、斜坡限制、损耗以及循环成本方面有所不同。从长远来看，目标是最大限度地减少与电池循环相关的退化成本;这是一个正式的马尔可夫决策过程。我们提出了一种基于线性函数近似的q -学习算法，用于学习最优解，该算法使用了一类特殊设计的核函数，这些核函数近似于与MDP相关的值函数的结构。通过广泛的案例研究验证了所提出的算法。

本文章由计算机程序翻译，如有差异，请以英文原文为准。

查看原文本刊更多论文

Optimal Cycling of a Heterogenous Battery Bank via Reinforcement Learning

We consider the problem of optimal charging/discharging of a bank of heterogenous battery units, driven by stochastic electricity generation and demand processes. The batteries in the battery bank may differ with respect to their capacities, ramp constraints, losses, as well as cycling costs. The goal is to minimize the degradation costs associated with battery cycling in the long run; this is posed formally as a Markov decision process. We propose a linear function approximation based Q-learning algorithm for learning the optimal solution, using a specially designed class of kernel functions that approximate the structure of the value functions associated with the MDP. The proposed algorithm is validated via an extensive case study.

求助全文

通过发布文献求助，成功后即可免费获取论文全文。去求助

来源期刊

2021 IEEE International Conference on Communications, Control, and Computing Technologies for Smart Grids (SmartGridComm)

自引率

0.00%

发文量