Caiyun Yang, Yu Zhang, Junjie Wang, Lijun He, Han Wu
{"title":"基于深度强化学习的自动化集装箱码头IGV双向任务分配与收费联合调度多智能体仿真优化方法","authors":"Caiyun Yang, Yu Zhang, Junjie Wang, Lijun He, Han Wu","doi":"10.1016/j.cor.2025.107189","DOIUrl":null,"url":null,"abstract":"<div><div>Intelligent guided vehicle (IGV) task allocation and charging scheduling at automated container terminal (ACT) are two important operational links that interact with each other. The joint scheduling problem of IGV task allocation and charging aims to improve the operational coherence and efficiency of the transportation system. In the parallelly arranged ACT, since IGV needs to enter the yard for side loading operations, the task allocation will greatly affect the empty travel distance and power consumption of IGV. In addition, the dual-cycling mode and the changes in power consumption rate under different IGV operation states make the above joint scheduling problem more complicated. In order to solve this problem, this paper uses the Markov decision process to characterize the IGV bidirectional task allocation and charging joint scheduling problem, and designs a multi-agent simulation optimization method based on deep reinforcement learning to generate a real-time adaptive scheduling solution. Considering the comprehensive impact of the agent’s long-term and short-term goals, a reward function is designed, and the deep neural network training is used to guide the IGV agent to choose the action with the largest expected cumulative reward. In addition, an adaptive double threshold charging strategy is designed, under which IGV can flexibly select the minimum charging threshold according to the system status. Finally, a multi-agent fine-grained simulation model is constructed to verify the effectiveness of the proposed method. 
Through comparative experiments with three single heuristic scheduling rules, different reinforcement learning algorithms and a fixed single-threshold charging strategy, it is proved that the new method can improve the operating efficiency of the system and the battery utilization of the IGV, and reduce the empty travel time of the IGV. Simulation experiments show that flexible task allocation and charging strategies can better adapt to the complex and dynamic operating environment of the ACT.</div></div>","PeriodicalId":10542,"journal":{"name":"Computers & Operations Research","volume":"183 ","pages":"Article 107189"},"PeriodicalIF":4.1000,"publicationDate":"2025-06-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"A deep reinforcement learning based multi-agent simulation optimization approach for IGV bidirectional task allocation and charging joint scheduling in automated container terminals\",\"authors\":\"Caiyun Yang, Yu Zhang, Junjie Wang, Lijun He, Han Wu\",\"doi\":\"10.1016/j.cor.2025.107189\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"<div><div>Intelligent guided vehicle (IGV) task allocation and charging scheduling at automated container terminal (ACT) are two important operational links that interact with each other. The joint scheduling problem of IGV task allocation and charging aims to improve the operational coherence and efficiency of the transportation system. In the parallelly arranged ACT, since IGV needs to enter the yard for side loading operations, the task allocation will greatly affect the empty travel distance and power consumption of IGV. In addition, the dual-cycling mode and the changes in power consumption rate under different IGV operation states make the above joint scheduling problem more complicated. 
In order to solve this problem, this paper uses the Markov decision process to characterize the IGV bidirectional task allocation and charging joint scheduling problem, and designs a multi-agent simulation optimization method based on deep reinforcement learning to generate a real-time adaptive scheduling solution. Considering the comprehensive impact of the agent’s long-term and short-term goals, a reward function is designed, and the deep neural network training is used to guide the IGV agent to choose the action with the largest expected cumulative reward. In addition, an adaptive double threshold charging strategy is designed, under which IGV can flexibly select the minimum charging threshold according to the system status. Finally, a multi-agent fine-grained simulation model is constructed to verify the effectiveness of the proposed method. Through comparative experiments with three single heuristic scheduling rules, different reinforcement learning algorithms and a fixed single-threshold charging strategy, it is proved that the new method can improve the operating efficiency of the system and the battery utilization of the IGV, and reduce the empty travel time of the IGV. 
Simulation experiments show that flexible task allocation and charging strategies can better adapt to the complex and dynamic operating environment of the ACT.</div></div>\",\"PeriodicalId\":10542,\"journal\":{\"name\":\"Computers & Operations Research\",\"volume\":\"183 \",\"pages\":\"Article 107189\"},\"PeriodicalIF\":4.1000,\"publicationDate\":\"2025-06-21\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Computers & Operations Research\",\"FirstCategoryId\":\"5\",\"ListUrlMain\":\"https://www.sciencedirect.com/science/article/pii/S0305054825002175\",\"RegionNum\":2,\"RegionCategory\":\"工程技术\",\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"Q2\",\"JCRName\":\"COMPUTER SCIENCE, INTERDISCIPLINARY APPLICATIONS\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Computers & Operations Research","FirstCategoryId":"5","ListUrlMain":"https://www.sciencedirect.com/science/article/pii/S0305054825002175","RegionNum":2,"RegionCategory":"工程技术","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q2","JCRName":"COMPUTER SCIENCE, INTERDISCIPLINARY APPLICATIONS","Score":null,"Total":0}
A deep reinforcement learning based multi-agent simulation optimization approach for IGV bidirectional task allocation and charging joint scheduling in automated container terminals
Intelligent guided vehicle (IGV) task allocation and charging scheduling at automated container terminals (ACTs) are two important, mutually interacting operational links. The joint scheduling problem of IGV task allocation and charging aims to improve the operational coherence and efficiency of the transportation system. In a parallel-layout ACT, where IGVs must enter the yard for side loading operations, task allocation strongly affects the empty travel distance and power consumption of the IGVs. In addition, the dual-cycling mode and the variation in power consumption rate across different IGV operating states further complicate this joint scheduling problem. To address it, this paper models the IGV bidirectional task allocation and charging joint scheduling problem as a Markov decision process and designs a multi-agent simulation optimization method based on deep reinforcement learning that generates real-time adaptive scheduling solutions. A reward function is designed that accounts for both the agent's long-term and short-term goals, and deep neural network training guides each IGV agent to choose the action with the largest expected cumulative reward. In addition, an adaptive double-threshold charging strategy is designed, under which an IGV can flexibly select its minimum charging threshold according to system status. Finally, a multi-agent fine-grained simulation model is constructed to verify the effectiveness of the proposed method. Comparative experiments against three single heuristic scheduling rules, different reinforcement learning algorithms, and a fixed single-threshold charging strategy show that the new method improves the operating efficiency of the system and the battery utilization of the IGVs while reducing their empty travel time.
Simulation experiments show that flexible task allocation and charging strategies adapt better to the complex and dynamic operating environment of an ACT.
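To make the adaptive double-threshold idea concrete, the following is a minimal sketch of such a charging rule: an IGV starts charging when its battery falls below a lower threshold chosen adaptively from the system state (here, the number of pending tasks), and keeps charging until a fixed upper threshold. All function names, numeric thresholds, and the interpolation scheme are illustrative assumptions, not the paper's actual parameters or policy.

```python
# Illustrative sketch of an adaptive double-threshold charging rule.
# Thresholds and the load measure (pending_tasks) are hypothetical.

def min_charge_threshold(pending_tasks: int, low: float = 0.2, high: float = 0.4) -> float:
    """Pick the lower (start-charging) threshold adaptively: when many
    tasks are pending, tolerate a lower battery level before charging;
    when the system is quiet, charge opportunistically and earlier."""
    if pending_tasks >= 10:        # busy system: defer charging
        return low
    if pending_tasks <= 0:         # idle system: charge early
        return high
    # linear interpolation between the two fixed bounds
    return high - (high - low) * pending_tasks / 10


def charging_action(battery: float, currently_charging: bool,
                    pending_tasks: int, max_threshold: float = 0.8) -> bool:
    """Double-threshold (hysteresis) rule: start charging below the
    adaptive lower threshold; once charging, continue until the battery
    reaches the fixed upper threshold."""
    if currently_charging:
        return battery < max_threshold
    return battery < min_charge_threshold(pending_tasks)
```

The two thresholds form a hysteresis band, so an IGV does not oscillate between charging and working near a single cutoff; only the lower edge of the band adapts to system load in this sketch.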
Journal description:
Operations research and computers meet in a large number of scientific fields, many of which are of vital current concern to our troubled society. These include, among others, ecology, transportation, safety, reliability, urban planning, economics, inventory control, investment strategy and logistics (including reverse logistics). Computers & Operations Research provides an international forum for the application of computers and operations research techniques to problems in these and related fields.