Comprehensive Decision-Making for Picking and Replenishment in a DQN-Based Hybrid “Parts-To-Picker” Order Picking System

IF 2.5, Zone 4 (Computer Science), Q3 TELECOMMUNICATIONS

Transactions on Emerging Telecommunications Technologies Pub Date: 2025-10-02 DOI: 10.1002/ett.70277

Xin Wang, Yaohua Wu

Volume 36, Issue 10. Publication type: Journal Article. Full text: https://onlinelibrary.wiley.com/doi/10.1002/ett.70277

Citations: 0

Abstract


Comprehensive Decision-Making for Picking and Replenishment in a DQN-Based Hybrid “Parts-To-Picker” Order Picking System


Some large distribution centers have introduced a hybrid “parts-to-picker” order picking system, consisting of a pallet warehouse and a tote warehouse, to meet diverse order requirements. The two warehouses operate collaboratively during picking and centralized replenishment, and surplus goods left on a pallet after picking can be replenished into the tote warehouse (picking replenishment). Making informed decisions about picking, centralized replenishment, and picking replenishment can therefore significantly enhance overall warehouse operational efficiency. However, to the best of our knowledge, this problem has not yet been studied. This paper addresses the comprehensive decision-making problem of replenishment and picking in a hybrid “parts-to-picker” order picking system. To solve it, we propose an intelligent decision-making framework based on deep reinforcement learning, specifically a deep Q-network (DQN). We design a state space that incorporates predicted orders, composite warehouse inventory, and warehouse unit status, and an action space that includes centralized replenishment, picking replenishment, and picking actions. The framework makes three decisions: the allocation of quantities between pallet and tote picking, the allocation of quantities for centralized replenishment, and whether to replenish after picking. The DQN model also combines a reward function that includes penalty factors with an ε-greedy decay strategy, effectively improving order processing efficiency. Experimental results show that, compared with traditional scheduling strategies and intelligent algorithms, the decision-making model trained with the proposed DQN architecture responds quickly when making comprehensive picking and replenishment decisions in a hybrid “parts-to-picker” order picking system and significantly improves system efficiency. The DQN model improves efficiency by approximately 44% and 17% compared with empirical decision-making and meta-heuristic algorithms, respectively. This study provides a solution that combines theoretical innovation and engineering feasibility for the multi-functional collaborative optimization of smart warehouse logistics systems, demonstrating significant practical value.
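The abstract names three ingredients (a composite action space, a penalized reward, and ε-greedy exploration with decay) without giving their exact form. The following is a minimal Python sketch of how such pieces might fit together. Everything concrete in it is an assumption made for illustration and not the authors' design: the discretization of the actions, the exponential decay schedule, the penalty terms and weights, and all names (`ACTIONS`, `epsilon_at`, `select_action`, `reward`).

```python
import math
import random
from itertools import product

# Hypothetical discretization of the three decision types named in the abstract.
PALLET_PICK_SHARES = (0.0, 0.5, 1.0)   # assumed: share of demand picked from the pallet warehouse
CENTRAL_REPLENISH_QTY = (0, 1, 2)      # assumed: units sent to centralized replenishment
REPLENISH_AFTER_PICK = (False, True)   # move pallet surplus into the tote warehouse?

# Composite action space: every combination of the three decisions.
ACTIONS = list(product(PALLET_PICK_SHARES, CENTRAL_REPLENISH_QTY, REPLENISH_AFTER_PICK))


def epsilon_at(step, eps_start=1.0, eps_end=0.05, decay_steps=500.0):
    """Exponentially decay the exploration rate from eps_start toward eps_end.

    The schedule and its parameters are illustrative assumptions.
    """
    return eps_end + (eps_start - eps_end) * math.exp(-step / decay_steps)


def select_action(q_values, epsilon, rng):
    """Epsilon-greedy choice: explore with probability epsilon, else take argmax Q."""
    if rng.random() < epsilon:
        return rng.randrange(len(q_values))
    return max(range(len(q_values)), key=q_values.__getitem__)


def reward(orders_completed, stockouts, idle_units, w_stockout=2.0, w_idle=0.5):
    """Throughput reward minus penalty terms (terms and weights are hypothetical)."""
    return orders_completed - w_stockout * stockouts - w_idle * idle_units


# Usage: pick an action index from placeholder Q-values at training step 100.
rng = random.Random(0)
q_values = [0.0] * len(ACTIONS)        # stand-in for the output of a trained Q-network
chosen = ACTIONS[select_action(q_values, epsilon_at(100), rng)]
```

In a full DQN, `q_values` would come from a neural network evaluated on the state vector (predicted orders, inventory, and unit status); the sketch leaves that part out so that it stays dependency-free.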


Source journal

Transactions on Emerging Telecommunications Technologies (TELECOMMUNICATIONS)

CiteScore

8.90

Self-citation rate

13.90%

Articles published

249

Journal description: Transactions on Emerging Telecommunications Technologies (ETT), formerly known as European Transactions on Telecommunications (ETT), has the following aims:
- to attract cutting-edge publications from leading researchers and research groups around the world
- to become a highly cited source of timely research findings in emerging fields of telecommunications
- to limit revision and publication cycles to a few months and thus significantly increase attractiveness to publish
- to become the leading journal for publishing the latest developments in telecommunications