A policy-based Monte Carlo tree search method for container pre-marshalling

IF 7.0, Zone 2 (Engineering and Technology), JCR Q1 (ENGINEERING, INDUSTRIAL)
Ziliang Wang, Chenhao Zhou, Ada Che, Jingkun Gao
DOI: 10.1080/00207543.2023.2279130 (https://doi.org/10.1080/00207543.2023.2279130)
International Journal of Production Research, journal article, published 2023-11-07
Citations: 0

Abstract

A policy-based Monte Carlo tree search method for container pre-marshalling
The container pre-marshalling problem (CPMP) aims to minimise the number of reshuffling moves, ultimately achieving an optimised stacking arrangement in each bay based on the priority of containers during the non-loading phase. Given the sequential nature of the decisions, we formulate the CPMP as a Markov decision process (MDP) model to account for the specific states and actions of the reshuffling process. To address the challenge that a relocated container may trigger a chain effect on subsequent reshuffling moves, this paper develops an improved policy-based Monte Carlo tree search (P-MCTS) to solve the CPMP, where eight composite reshuffling rules and modified upper confidence bounds are employed in the selection phase, and a well-designed heuristic algorithm is utilised in the simulation phase. Meanwhile, considering the effectiveness of reinforcement learning methods for solving MDP models, an improved Q-learning algorithm is proposed as the comparison method. Numerical results show that the P-MCTS outperforms all compared methods both in scenarios where all containers have different priorities and in scenarios where containers can share the same priority.

KEYWORDS: Container pre-marshalling problem; Monte Carlo tree search; Markov decision process; Q-learning algorithm; Automated container terminal

Acknowledgement
This research was made possible with funding support from National Natural Science Foundation of China [72101203, 71871183], Shaanxi Provincial Key R&D Program, China [2022KW-02], and China Scholarship Council [grant number 202206290124].

Disclosure statement
No potential conflict of interest was reported by the author(s).

Data availability statement
Data sharing not applicable; no new data generated.

Funding
This work was supported by National Natural Science Foundation of China [Grant Numbers 72101203, 71871183]; China Scholarship Council [Grant Number 202206290124]; Shaanxi Provincial Key R&D Program, China [Grant Number 2022KW-02].

Notes on contributors
Mr. Ziliang Wang is a doctoral student in the School of Management at Northwestern Polytechnical University.
Dr. Chenhao Zhou is a Professor in the School of Management at Northwestern Polytechnical University. Prior to this, he was a Research Assistant Professor in the Department of Industrial Systems Engineering and Management, National University of Singapore. His research interests are transportation systems and maritime logistics using simulation and optimization methods.
Dr. Ada Che is a Professor in the School of Management at Northwestern Polytechnical University. He received the B.S. and Ph.D. degrees in Mechanical Engineering from Xi’an Jiaotong University in 1994 and 1999, respectively. Since 2005, he has been a Professor in the School of Management at Northwestern Polytechnical University. His current research interests include transportation planning and optimisation, production scheduling, and operations research.
Mr. Jingkun Gao is currently an Engineer with Northwest Electric Power Design Institute Co., Ltd. of China Power Engineering Consulting Group. He received the Master’s degree from the School of Management at Northwestern Polytechnical University.
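
The abstract describes the CPMP as an MDP whose states are bay layouts and whose actions are reshuffling moves, searched with an upper-confidence-bound rule. The Python sketch below illustrates that framing only and is not the authors' implementation. Everything in it is an assumption made for illustration: the bay encoding (stacks listed bottom to top, smaller number means the container leaves earlier), the hypothetical helpers `legal_moves`, `apply_move` and `blocking_count`, and the use of the standard UCB1 score. The paper's modified upper confidence bounds, its eight composite reshuffling rules and its simulation heuristic are not reproduced here, since the abstract does not specify them.

```python
import math
from typing import List, Tuple

# Assumed encoding: a bay is a list of stacks, each stack listed bottom to top.
# Assumed convention: a smaller number means the container is retrieved earlier.
Bay = List[List[int]]
Move = Tuple[int, int]  # (source stack index, destination stack index)


def legal_moves(bay: Bay, max_height: int) -> List[Move]:
    """Enumerate the MDP actions: move the top container of one stack onto another."""
    moves = []
    for src, src_stack in enumerate(bay):
        if not src_stack:
            continue  # nothing to pick up from an empty stack
        for dst, dst_stack in enumerate(bay):
            if dst != src and len(dst_stack) < max_height:
                moves.append((src, dst))
    return moves


def apply_move(bay: Bay, move: Move) -> Bay:
    """State transition: return a new bay with one container relocated."""
    src, dst = move
    new_bay = [list(stack) for stack in bay]  # copy so the input state is untouched
    new_bay[dst].append(new_bay[src].pop())
    return new_bay


def blocking_count(bay: Bay) -> int:
    """Count containers that sit somewhere above a container that must leave earlier.

    Illustrative measure only: the bay is fully sorted exactly when this is 0.
    """
    total = 0
    for stack in bay:
        earliest_below = math.inf  # smallest priority number seen lower in the stack
        for priority in stack:
            if priority > earliest_below:
                total += 1
            earliest_below = min(earliest_below, priority)
    return total


def ucb1(mean_reward: float, parent_visits: int, child_visits: int,
         c: float = math.sqrt(2)) -> float:
    """Standard UCB1 score used in plain MCTS selection (not the paper's modified bound)."""
    if child_visits == 0:
        return math.inf  # unvisited children are always tried first
    return mean_reward + c * math.sqrt(math.log(parent_visits) / child_visits)
```

For example, with `bay = [[1, 3], [2], []]` and `max_height=2`, container 3 sits above container 1, so `blocking_count(bay)` is 1, and applying the move `(0, 2)` yields a bay with no blocking containers. A tree search would expand `legal_moves` at each node and choose among children by a score such as `ucb1`.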
Source journal
International Journal of Production Research
Category: Management Science, Engineering: Industrial
CiteScore: 19.20
Self-citation rate: 14.10%
Articles published: 318
Review time: 6.3 months
About the journal: The International Journal of Production Research (IJPR), published since 1961, is a well-established, highly successful and leading journal reporting manufacturing, production and operations management research. IJPR is published 24 times a year and includes papers on innovation management, design of products, manufacturing processes, production and logistics systems. Production economics, the essential behaviour of production resources and systems as well as the complex decision problems that arise in design, management and control of production and logistics systems are considered. IJPR is a journal for researchers and professors in mechanical engineering, industrial and systems engineering, operations research and management science, and business. It is also an informative reference for industrial managers looking to improve the efficiency and effectiveness of their production systems.