Deep reinforcement learning for the real-time inventory rack storage assignment and replenishment problem

IF 6.0 | CAS Zone 2 (Management) | JCR Q1, Operations Research & Management Science

European Journal of Operational Research | Pub Date: 2025-05-09 | DOI: 10.1016/j.ejor.2025.05.008

Sander Teck, Tú San Phạm, Louis-Martin Rousseau, Pieter Vansteenwegen

Citations: 0

Abstract

The e-commerce industry is rapidly moving toward greater automation. As warehouse operations grow more intricate, control systems are needed that can handle this complexity efficiently. This study considers a Robotic Mobile Fulfillment System (RMFS), a semi-automated warehousing system in which autonomous mobile robots (AMRs) retrieve inventory racks from the storage area, so that no human activity is needed within the storage area itself. The robot fleet stores the racks and retrieves them, carrying them either to workstations, where human pickers pick items from the racks, or to replenishment stations, where depleted racks are restocked. An attractive characteristic of the RMFS is that it dynamically repositions inventory racks based on how frequently each rack is requested and on its stock level. For this dynamic rack positioning problem, the optimization objective considered in this study is to minimize the average cycle time the mobile robots need to perform retrieval and replenishment activities. We propose a deep reinforcement learning approach that trains a decision-making agent to learn a policy for the storage assignment and replenishment of inventory racks. The learned policy is compared with the decision rules commonly used in the academic literature on this problem. The experimental results show the potential benefits of training an agent to learn a storage and replenishment policy: cycle time improvements of up to 5.4% can be achieved over the best-performing decision rules. This research advances the understanding of intelligent storage assignment and replenishment strategies for real-time decision-making within an RMFS.
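
To make the storage assignment decision and the cycle-time objective concrete, here is a minimal Python sketch. It is not taken from the paper: the grid layout, the Manhattan-distance travel model, the constant robot speed, and the names `Cell`, `closest_open_location`, and `average_cycle_time` are all illustrative assumptions. The rule shown (store the rack in the open cell nearest to the station) is only one example of the kind of simple decision rule a learned policy might be compared against; the abstract does not say which rules the authors used.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Cell:
    """A grid position in a hypothetical storage area (assumed layout)."""
    x: int
    y: int


def manhattan(a: Cell, b: Cell) -> int:
    """Travel distance on a grid with axis-aligned aisles (assumption)."""
    return abs(a.x - b.x) + abs(a.y - b.y)


def closest_open_location(station: Cell, open_cells: list[Cell]) -> Cell:
    """Illustrative baseline rule: pick the open cell nearest to the station.

    Ties are broken by coordinates so the choice is deterministic.
    """
    if not open_cells:
        raise ValueError("no open storage location available")
    return min(open_cells, key=lambda c: (manhattan(station, c), c.x, c.y))


def average_cycle_time(trips: list[tuple[Cell, Cell]], speed: float = 1.0) -> float:
    """Mean travel time over (origin, destination) trips at constant speed."""
    if not trips:
        raise ValueError("at least one trip is required")
    total_distance = sum(manhattan(origin, dest) for origin, dest in trips)
    return total_distance / (speed * len(trips))


if __name__ == "__main__":
    station = Cell(0, 0)
    open_cells = [Cell(3, 1), Cell(1, 1), Cell(2, 5)]
    chosen = closest_open_location(station, open_cells)
    # One storage trip (station to cell) and one later retrieval (cell to station).
    trips = [(station, chosen), (chosen, station)]
    print(chosen, average_cycle_time(trips))
```

In this toy model a rule is scored by the average cycle time it produces over a stream of trips. A learned policy would replace `closest_open_location` with a function of the warehouse state, such as rack request frequencies and stock levels, which a nearest-cell rule ignores.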

Source journal

European Journal of Operational Research (Management Science: Operations Research & Management Science)

CiteScore: 11.90
Self-citation rate: 9.40%
Articles published: 786
Review time: 8.2 months

About the journal: The European Journal of Operational Research (EJOR) publishes high-quality, original papers that contribute to the methodology of operational research (OR) and to the practice of decision making.