Caiyun Yang, Yu Zhang, Junjie Wang, Lijun He, Han Wu
{"title":"基于深度强化学习的自动化集装箱码头IGV双向任务分配与收费联合调度多智能体仿真优化方法","authors":"Caiyun Yang, Yu Zhang, Junjie Wang, Lijun He, Han Wu","doi":"10.1016/j.cor.2025.107189","DOIUrl":null,"url":null,"abstract":"<div><div>Intelligent guided vehicle (IGV) task allocation and charging scheduling at automated container terminal (ACT) are two important operational links that interact with each other. The joint scheduling problem of IGV task allocation and charging aims to improve the operational coherence and efficiency of the transportation system. In the parallelly arranged ACT, since IGV needs to enter the yard for side loading operations, the task allocation will greatly affect the empty travel distance and power consumption of IGV. In addition, the dual-cycling mode and the changes in power consumption rate under different IGV operation states make the above joint scheduling problem more complicated. In order to solve this problem, this paper uses the Markov decision process to characterize the IGV bidirectional task allocation and charging joint scheduling problem, and designs a multi-agent simulation optimization method based on deep reinforcement learning to generate a real-time adaptive scheduling solution. Considering the comprehensive impact of the agent’s long-term and short-term goals, a reward function is designed, and the deep neural network training is used to guide the IGV agent to choose the action with the largest expected cumulative reward. In addition, an adaptive double threshold charging strategy is designed, under which IGV can flexibly select the minimum charging threshold according to the system status. Finally, a multi-agent fine-grained simulation model is constructed to verify the effectiveness of the proposed method. 
Through comparative experiments with three single heuristic scheduling rules, different reinforcement learning algorithms and a fixed single-threshold charging strategy, it is proved that the new method can improve the operating efficiency of the system and the battery utilization of the IGV, and reduce the empty travel time of the IGV. Simulation experiments show that flexible task allocation and charging strategies can better adapt to the complex and dynamic operating environment of the ACT.</div></div>","PeriodicalId":10542,"journal":{"name":"Computers & Operations Research","volume":"183 ","pages":"Article 107189"},"PeriodicalIF":4.1000,"publicationDate":"2025-06-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"A deep reinforcement learning based multi-agent simulation optimization approach for IGV bidirectional task allocation and charging joint scheduling in automated container terminals\",\"authors\":\"Caiyun Yang, Yu Zhang, Junjie Wang, Lijun He, Han Wu\",\"doi\":\"10.1016/j.cor.2025.107189\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"<div><div>Intelligent guided vehicle (IGV) task allocation and charging scheduling at automated container terminal (ACT) are two important operational links that interact with each other. The joint scheduling problem of IGV task allocation and charging aims to improve the operational coherence and efficiency of the transportation system. In the parallelly arranged ACT, since IGV needs to enter the yard for side loading operations, the task allocation will greatly affect the empty travel distance and power consumption of IGV. In addition, the dual-cycling mode and the changes in power consumption rate under different IGV operation states make the above joint scheduling problem more complicated. 
In order to solve this problem, this paper uses the Markov decision process to characterize the IGV bidirectional task allocation and charging joint scheduling problem, and designs a multi-agent simulation optimization method based on deep reinforcement learning to generate a real-time adaptive scheduling solution. Considering the comprehensive impact of the agent’s long-term and short-term goals, a reward function is designed, and the deep neural network training is used to guide the IGV agent to choose the action with the largest expected cumulative reward. In addition, an adaptive double threshold charging strategy is designed, under which IGV can flexibly select the minimum charging threshold according to the system status. Finally, a multi-agent fine-grained simulation model is constructed to verify the effectiveness of the proposed method. Through comparative experiments with three single heuristic scheduling rules, different reinforcement learning algorithms and a fixed single-threshold charging strategy, it is proved that the new method can improve the operating efficiency of the system and the battery utilization of the IGV, and reduce the empty travel time of the IGV. 
Simulation experiments show that flexible task allocation and charging strategies can better adapt to the complex and dynamic operating environment of the ACT.</div></div>\",\"PeriodicalId\":10542,\"journal\":{\"name\":\"Computers & Operations Research\",\"volume\":\"183 \",\"pages\":\"Article 107189\"},\"PeriodicalIF\":4.1000,\"publicationDate\":\"2025-06-21\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Computers & Operations Research\",\"FirstCategoryId\":\"5\",\"ListUrlMain\":\"https://www.sciencedirect.com/science/article/pii/S0305054825002175\",\"RegionNum\":2,\"RegionCategory\":\"工程技术\",\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"Q2\",\"JCRName\":\"COMPUTER SCIENCE, INTERDISCIPLINARY APPLICATIONS\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Computers & Operations Research","FirstCategoryId":"5","ListUrlMain":"https://www.sciencedirect.com/science/article/pii/S0305054825002175","RegionNum":2,"RegionCategory":"工程技术","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q2","JCRName":"COMPUTER SCIENCE, INTERDISCIPLINARY APPLICATIONS","Score":null,"Total":0}
A deep reinforcement learning based multi-agent simulation optimization approach for IGV bidirectional task allocation and charging joint scheduling in automated container terminals
Intelligent guided vehicle (IGV) task allocation and charging scheduling at automated container terminals (ACTs) are two important, mutually interacting operational links. The joint scheduling problem of IGV task allocation and charging aims to improve the operational coherence and efficiency of the transportation system. In a parallel-layout ACT, where IGVs must enter the yard for side loading operations, task allocation strongly affects the empty travel distance and power consumption of the IGVs. In addition, the dual-cycling mode and the variation in power consumption rate across different IGV operating states further complicate this joint scheduling problem. To address it, this paper models the IGV bidirectional task allocation and charging joint scheduling problem as a Markov decision process and designs a multi-agent simulation optimization method based on deep reinforcement learning that generates real-time adaptive scheduling solutions. A reward function is designed that accounts for both the agent's long-term and short-term goals, and deep neural network training guides each IGV agent to choose the action with the largest expected cumulative reward. In addition, an adaptive double-threshold charging strategy is designed, under which an IGV can flexibly select its minimum charging threshold according to system status. Finally, a multi-agent fine-grained simulation model is constructed to verify the effectiveness of the proposed method. Comparative experiments against three single heuristic scheduling rules, different reinforcement learning algorithms, and a fixed single-threshold charging strategy show that the new method improves the operating efficiency of the system and the battery utilization of the IGVs while reducing their empty travel time.
Simulation experiments show that flexible task allocation and charging strategies adapt better to the complex and dynamic operating environment of an ACT.
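To make the adaptive double-threshold idea concrete, the following is a minimal sketch of such a charging rule: an IGV starts charging when its battery falls below a lower threshold chosen adaptively from the system state (here, the number of pending tasks), and keeps charging until a fixed upper threshold. All function names, numeric thresholds, and the interpolation scheme are illustrative assumptions, not the paper's actual parameters or policy.

```python
# Illustrative sketch of an adaptive double-threshold charging rule.
# Thresholds and the load measure (pending_tasks) are hypothetical.

def min_charge_threshold(pending_tasks: int, low: float = 0.2, high: float = 0.4) -> float:
    """Pick the lower (start-charging) threshold adaptively: when many
    tasks are pending, tolerate a lower battery level before charging;
    when the system is quiet, charge opportunistically and earlier."""
    if pending_tasks >= 10:        # busy system: defer charging
        return low
    if pending_tasks <= 0:         # idle system: charge early
        return high
    # linear interpolation between the two fixed bounds
    return high - (high - low) * pending_tasks / 10


def charging_action(battery: float, currently_charging: bool,
                    pending_tasks: int, max_threshold: float = 0.8) -> bool:
    """Double-threshold (hysteresis) rule: start charging below the
    adaptive lower threshold; once charging, continue until the battery
    reaches the fixed upper threshold."""
    if currently_charging:
        return battery < max_threshold
    return battery < min_charge_threshold(pending_tasks)
```

The two thresholds form a hysteresis band, so an IGV does not oscillate between charging and working near a single cutoff; only the lower edge of the band adapts to system load in this sketch.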
Journal description:
Operations research and computers meet in a large number of scientific fields, many of which are of vital current concern to our troubled society. These include, among others, ecology, transportation, safety, reliability, urban planning, economics, inventory control, investment strategy and logistics (including reverse logistics). Computers & Operations Research provides an international forum for the application of computers and operations research techniques to problems in these and related fields.