Investigation of energy management strategy based on deep reinforcement learning algorithm for multi-speed pure electric vehicles

Authors: Weiwei Yang, Denghao Luo, Wenming Zhang, Nong Zhang
DOI: 10.1177/09544070241275427 (https://doi.org/10.1177/09544070241275427)
Journal: Proceedings of the Institution of Mechanical Engineers Part D-Journal of Automobile Engineering (JCR Q3, Engineering, Mechanical; Impact Factor 1.5)
Publication date: 2024-08-29
Publication type: Journal Article
With increasingly prominent problems such as environmental pollution and the energy crisis, the development of pure electric vehicles has attracted growing attention. However, short driving range remains one of the main factors deterring consumer purchases. Therefore, this paper proposes optimized energy management strategies (EMS) for multi-speed pure electric vehicles based on the Soft Actor-Critic (SAC) and the Deep Deterministic Policy Gradient (DDPG) algorithms, respectively, with the goal of minimizing energy loss. Vehicle speed, acceleration, and battery SOC are selected as state variables, and the action space is the transmission gear. The reward function accounts for both energy consumption and battery life. Simulation results reveal that the proposed SAC-based EMS outperforms DDPG in the NEDC cycle in three respects: (1) the battery SOC decreases from 0.8 to 0.7339 and 0.73385, with energy consumption of 5264.8 and 5296.6 kJ, respectively; (2) the maximum C-rate is 1.565 and 1.566, respectively; (3) the training efficiency of SAC is higher. The SAC-based energy management strategy therefore converges faster and approaches the optimal energy-saving result with a smaller gap. Under the WLTC condition, the SAC algorithm consumes 24.1 kJ less energy than DDPG, and the C-rate of SAC remains below 1; its overall maximum value of 1.565 lies within the reasonable operating range of vehicle batteries. The results show that the SAC algorithm adapts well to different working conditions.
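The decision problem sketched in the abstract can be illustrated as follows. This is a minimal, hypothetical sketch of the state/action/reward structure only: the gear set, weights, and C-rate limit are illustrative assumptions, not values from the paper, and the actual SAC/DDPG training loop is omitted.

```python
import numpy as np

# Assumed three-speed gearbox; the paper's gear ratios are not given in the abstract.
GEARS = (1, 2, 3)

def observe(speed_mps: float, accel_mps2: float, soc: float) -> np.ndarray:
    """Assemble the three state variables named in the abstract:
    vehicle speed, acceleration, and battery SOC."""
    return np.array([speed_mps, accel_mps2, soc], dtype=np.float32)

def reward(energy_kj: float, c_rate: float,
           w_energy: float = 1.0, w_battery: float = 0.5,
           c_rate_limit: float = 1.0) -> float:
    """Negative cost combining energy consumption with a battery-life proxy:
    a penalty is applied only when the C-rate exceeds an assumed limit."""
    battery_penalty = max(0.0, c_rate - c_rate_limit)
    return -(w_energy * energy_kj + w_battery * battery_penalty)

# Example step: C-rate within the limit, so only the energy term contributes.
s = observe(speed_mps=15.0, accel_mps2=0.5, soc=0.78)
r = reward(energy_kj=2.0, c_rate=0.9)
```

In an actual implementation, an SAC or DDPG agent would map each observation to a gear choice from `GEARS`, and the reward weights would be tuned to trade off range against battery stress.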
About the journal:
The Journal of Automobile Engineering is an established, high-quality, multi-disciplinary journal that publishes the very best peer-reviewed science and engineering in the field.