Deep reinforcement learning for dynamic order picking in warehouse operations

IF 4.1 · Zone 2, Engineering & Technology · Q2 COMPUTER SCIENCE, INTERDISCIPLINARY APPLICATIONS
Sasan Mahmoudinazlou, Abhay Sobhanan, Hadi Charkhgard, Ali Eshragh, George Dunn
{"title":"仓库操作中动态订单拣选的深度强化学习","authors":"Sasan Mahmoudinazlou ,&nbsp;Abhay Sobhanan ,&nbsp;Hadi Charkhgard ,&nbsp;Ali Eshragh ,&nbsp;George Dunn","doi":"10.1016/j.cor.2025.107112","DOIUrl":null,"url":null,"abstract":"<div><div>Order picking is a pivotal operation in warehouses that directly impacts overall efficiency and profitability. This study addresses the dynamic order picking problem, a significant concern in modern warehouse management, where real-time adaptation to fluctuating order arrivals and efficient picker routing are crucial. Traditional methods, which often depend on static optimization algorithms designed around fixed order sets for the picker routing, fall short in addressing the challenges of this dynamic environment. To overcome these challenges, we propose a Deep Reinforcement Learning (DRL) framework tailored for single-block warehouses equipped with an autonomous picking device. By dynamically optimizing picker routes, our approach significantly reduces order throughput times and unfulfilled orders, particularly under high order arrival rates. We benchmark our DRL model against established algorithms, utilizing instances generated based on standard practices in the order picking literature. Experimental results demonstrate the superiority of our DRL model over benchmark algorithms. For example, at a high order arrival rate of 0.09 (i.e., 9 orders per 100 units of time on average), our approach achieves an order fulfillment rate of approximately 98%, compared to the 82% fulfillment rate observed with benchmarking algorithms. We further investigate the integration of a hyperparameter in the reward function that allows for flexible balancing between distance traveled and order completion time. Finally, we demonstrate the robustness of our DRL model on out-of-sample test instances.</div></div>","PeriodicalId":10542,"journal":{"name":"Computers & Operations Research","volume":"182 ","pages":"Article 107112"},"PeriodicalIF":4.1000,"publicationDate":"2025-05-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"Deep reinforcement learning for dynamic order picking in warehouse operations\",\"authors\":\"Sasan Mahmoudinazlou ,&nbsp;Abhay Sobhanan ,&nbsp;Hadi Charkhgard ,&nbsp;Ali Eshragh ,&nbsp;George Dunn\",\"doi\":\"10.1016/j.cor.2025.107112\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"<div><div>Order picking is a pivotal operation in warehouses that directly impacts overall efficiency and profitability. This study addresses the dynamic order picking problem, a significant concern in modern warehouse management, where real-time adaptation to fluctuating order arrivals and efficient picker routing are crucial. Traditional methods, which often depend on static optimization algorithms designed around fixed order sets for the picker routing, fall short in addressing the challenges of this dynamic environment. To overcome these challenges, we propose a Deep Reinforcement Learning (DRL) framework tailored for single-block warehouses equipped with an autonomous picking device. By dynamically optimizing picker routes, our approach significantly reduces order throughput times and unfulfilled orders, particularly under high order arrival rates. We benchmark our DRL model against established algorithms, utilizing instances generated based on standard practices in the order picking literature. Experimental results demonstrate the superiority of our DRL model over benchmark algorithms. 
For example, at a high order arrival rate of 0.09 (i.e., 9 orders per 100 units of time on average), our approach achieves an order fulfillment rate of approximately 98%, compared to the 82% fulfillment rate observed with benchmarking algorithms. We further investigate the integration of a hyperparameter in the reward function that allows for flexible balancing between distance traveled and order completion time. Finally, we demonstrate the robustness of our DRL model on out-of-sample test instances.</div></div>\",\"PeriodicalId\":10542,\"journal\":{\"name\":\"Computers & Operations Research\",\"volume\":\"182 \",\"pages\":\"Article 107112\"},\"PeriodicalIF\":4.1000,\"publicationDate\":\"2025-05-08\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Computers & Operations Research\",\"FirstCategoryId\":\"5\",\"ListUrlMain\":\"https://www.sciencedirect.com/science/article/pii/S0305054825001406\",\"RegionNum\":2,\"RegionCategory\":\"工程技术\",\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"Q2\",\"JCRName\":\"COMPUTER SCIENCE, INTERDISCIPLINARY APPLICATIONS\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Computers & Operations Research","FirstCategoryId":"5","ListUrlMain":"https://www.sciencedirect.com/science/article/pii/S0305054825001406","RegionNum":2,"RegionCategory":"工程技术","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q2","JCRName":"COMPUTER SCIENCE, INTERDISCIPLINARY APPLICATIONS","Score":null,"Total":0}
Citations: 0

Abstract

Order picking is a pivotal operation in warehouses that directly impacts overall efficiency and profitability. This study addresses the dynamic order picking problem, a significant concern in modern warehouse management, where real-time adaptation to fluctuating order arrivals and efficient picker routing are crucial. Traditional methods, which often depend on static optimization algorithms designed around fixed order sets for the picker routing, fall short in addressing the challenges of this dynamic environment. To overcome these challenges, we propose a Deep Reinforcement Learning (DRL) framework tailored for single-block warehouses equipped with an autonomous picking device. By dynamically optimizing picker routes, our approach significantly reduces order throughput times and unfulfilled orders, particularly under high order arrival rates. We benchmark our DRL model against established algorithms, utilizing instances generated based on standard practices in the order picking literature. Experimental results demonstrate the superiority of our DRL model over benchmark algorithms. For example, at a high order arrival rate of 0.09 (i.e., 9 orders per 100 units of time on average), our approach achieves an order fulfillment rate of approximately 98%, compared to the 82% fulfillment rate observed with benchmarking algorithms. We further investigate the integration of a hyperparameter in the reward function that allows for flexible balancing between distance traveled and order completion time. Finally, we demonstrate the robustness of our DRL model on out-of-sample test instances.
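The abstract mentions a reward-function hyperparameter that balances distance traveled against order completion time. The sketch below illustrates one way such a weighted reward could look; the function name `step_reward`, the weight `alpha`, and the specific penalty/bonus terms are illustrative assumptions, not the authors' formulation.

```python
# Hypothetical sketch of a per-step reward that trades off travel distance
# against order completion via a weighting hyperparameter `alpha`.
# The paper's exact reward definition is not given in this abstract.

def step_reward(distance_moved: float,
                orders_completed: int,
                alpha: float = 0.5) -> float:
    """Return a per-step reward for the picker agent.

    alpha close to 1.0 emphasizes minimizing distance traveled;
    alpha close to 0.0 emphasizes completing orders quickly.
    Both terms and the default value of alpha are assumptions.
    """
    distance_penalty = -distance_moved          # discourage long detours
    completion_bonus = float(orders_completed)  # encourage fulfilling orders
    return alpha * distance_penalty + (1.0 - alpha) * completion_bonus
```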
Source journal
Computers & Operations Research (Engineering & Technology, Engineering: Industrial)
CiteScore: 8.60
Self-citation rate: 8.70%
Articles published: 292
Review time: 8.5 months
Journal description: Operations research and computers meet in a large number of scientific fields, many of which are of vital current concern to our troubled society. These include, among others, ecology, transportation, safety, reliability, urban planning, economics, inventory control, investment strategy and logistics (including reverse logistics). Computers & Operations Research provides an international forum for the application of computers and operations research techniques to problems in these and related fields.