Capacity planning in logistics corridors: Deep reinforcement learning for the dynamic stochastic temporal bin packing problem

IF 8.3 | Region 1, Engineering Technology | JCR Q1, ECONOMICS
Journal: Transportation Research Part E-Logistics and Transportation Review
DOI: 10.1016/j.tre.2024.103742
Publication date: 2024-08-31
Publication type: Journal Article
Full text: https://www.sciencedirect.com/science/article/pii/S1366554524003338
Source platform: Semanticscholar
Citations: 0

Abstract


This paper addresses the challenge of managing uncertainty in the daily capacity planning of a terminal in a corridor-based logistics system. Corridor-based logistics systems facilitate the exchange of freight between two distinct regions, usually involving industrial and logistics clusters. In this context, we introduce the dynamic stochastic temporal bin packing problem, which models the real-time assignment of individual containers to carriers’ trucks over discrete time units. We formulate it as a Markov decision process (MDP). Two characteristics distinguish our problem: the stochastic, time-dependent availability of containers, i.e., container delays, and the continuous-time, or dynamic, aspect of the planning, in which a container announcement may occur at any moment during the planning horizon. We introduce an innovative real-time planning algorithm based on Proximal Policy Optimization (PPO), a Deep Reinforcement Learning (DRL) method, to allocate individual containers to eligible carriers in real time. In addition, we propose several practical heuristics and two novel rolling-horizon batch-planning methods based on (stochastic) mixed-integer programming (MIP), which can be interpreted as computational information relaxation bounds because they delay decision making. The results show that the proposed DRL method outperforms the practical heuristics and, unlike the stochastic MIP-based approach, scales effectively to larger problems, making it a practically appealing solution.
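
To make the problem setting concrete, the following minimal sketch shows an online first-fit rule on a simplified, deterministic temporal bin packing instance. It is neither the paper's PPO method nor necessarily one of its heuristics. The names `Container`, `Truck`, and `first_fit`, the interval-based occupancy model, and the objective of using few trucks are all assumptions made for illustration, and container delays are not modeled.

```python
from dataclasses import dataclass, field


@dataclass
class Container:
    # Hypothetical container model: occupies `size` units of truck
    # capacity during the time units start, start+1, ..., end-1.
    name: str
    start: int
    end: int
    size: int = 1


@dataclass
class Truck:
    # Hypothetical truck model: a fixed capacity in every time unit.
    capacity: int
    horizon: int
    load: list = field(default_factory=list)

    def __post_init__(self):
        # Load carried in each discrete time unit of the horizon.
        self.load = [0] * self.horizon

    def fits(self, c):
        # The container fits if capacity holds in every time unit it occupies.
        return all(self.load[t] + c.size <= self.capacity
                   for t in range(c.start, c.end))

    def add(self, c):
        for t in range(c.start, c.end):
            self.load[t] += c.size


def first_fit(containers, capacity, horizon):
    """Assign each container, in announcement order, to the first truck
    with spare capacity over its whole interval; open a new truck otherwise."""
    trucks = []
    assignment = {}
    for c in containers:
        if c.size > capacity or not (0 <= c.start < c.end <= horizon):
            raise ValueError(f"container {c.name} cannot be planned")
        for i, truck in enumerate(trucks):
            if truck.fits(c):
                break
        else:
            # No existing truck can take the container: open a new one.
            trucks.append(Truck(capacity, horizon))
            i = len(trucks) - 1
            truck = trucks[i]
        truck.add(c)
        assignment[c.name] = i
    return assignment, len(trucks)


if __name__ == "__main__":
    announced = [Container("A", 0, 2), Container("B", 1, 3), Container("C", 2, 4)]
    plan, n_trucks = first_fit(announced, capacity=1, horizon=4)
    print(plan, n_trucks)  # {'A': 0, 'B': 1, 'C': 0} 2
```

In this toy instance, container C reuses truck 0 because A has already left it by time unit 2. A learned policy such as the one described above would replace the fixed first-fit rule with a decision that also accounts for uncertain container availability.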

Source journal
CiteScore: 16.20
Self-citation rate: 16.00%
Articles published: 285
Review time: 62 days
Journal description: Transportation Research Part E: Logistics and Transportation Review publishes articles on a wide range of topics in logistics and transportation research. The journal welcomes submissions on transport economics, transport infrastructure and investment appraisal, evaluation of public policies related to transportation, empirical and analytical studies of logistics management practices and performance, logistics and operations models, and logistics and supply chain management. Its content complements the other journals in the series: Transportation Research Part A: Policy and Practice, Part B: Methodological, Part C: Emerging Technologies, Part D: Transport and Environment, and Part F: Traffic Psychology and Behaviour. Together, these journals form a comprehensive reference for current research in transportation science.