Comprehensive Decision-Making for Picking and Replenishment in a DQN-Based Hybrid “Parts-To-Picker” Order Picking System
Authors: Xin Wang, Yaohua Wu
Journal: Transactions on Emerging Telecommunications Technologies, 36 (10)
Published: 2025-10-02
DOI: 10.1002/ett.70277 (https://onlinelibrary.wiley.com/doi/10.1002/ett.70277)
Impact factor: 2.5 (JCR Q3, Telecommunications)
Some large distribution centers have introduced a hybrid “parts-to-picker” order picking system that combines a pallet warehouse and a tote warehouse to meet diverse order requirements. The two warehouses operate collaboratively during picking and centralized replenishment, and surplus goods left on pallets after picking can be replenished into the tote warehouse. Informed decisions about picking, centralized replenishment, and picking replenishment can therefore substantially improve overall warehouse efficiency. However, to the best of our knowledge, this problem has not yet been studied. This paper addresses the integrated decision-making problem of replenishment and picking in a hybrid “parts-to-picker” order picking system. To solve it, we propose a decision-making framework based on deep reinforcement learning, specifically a deep Q-network (DQN). The state space incorporates predictive orders, composite warehouse inventory, and warehouse unit status. The action space includes centralized replenishment, picking replenishment, and picking actions. The framework makes three decisions: how to allocate picking quantities between pallets and totes, how to allocate centralized replenishment quantities, and whether to replenish after picking. The DQN model also combines a reward function that includes penalty factors with an ε-greedy decay strategy to improve order processing efficiency. Experimental results show that, compared with traditional scheduling strategies and intelligent algorithms, the trained DQN model responds quickly when making integrated picking and replenishment decisions and significantly improves system efficiency. The DQN model improves efficiency by approximately 44% and 17% compared with empirical decision-making and meta-heuristic algorithms, respectively. This study provides a solution that combines theoretical innovation and engineering feasibility for the multi-functional collaborative optimization of smart warehouse logistics systems, demonstrating its practical value.
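The abstract mentions an ε-greedy decay strategy and a reward with penalty factors but does not give their exact form. The following is a minimal sketch of how such components are commonly written, not the authors' implementation: the function names, the exponential decay schedule, the constants, and the shape of the reward (throughput minus a penalty on unmet demand) are all assumptions made for illustration.

```python
import random


def epsilon_at(step, eps_start=1.0, eps_min=0.05, decay=0.995):
    """Exploration rate after `step` decisions (assumed exponential decay).

    The rate shrinks by a constant factor per step and never drops
    below eps_min, so some exploration always remains.
    """
    return max(eps_min, eps_start * decay ** step)


def select_action(q_values, step, rng=random):
    """Epsilon-greedy choice over a list of Q-value estimates.

    With probability epsilon a random action index is returned;
    otherwise the index of the largest Q-value is returned.
    """
    if rng.random() < epsilon_at(step):
        return rng.randrange(len(q_values))
    return max(range(len(q_values)), key=q_values.__getitem__)


def reward(units_processed, unmet_units, penalty=2.0):
    """Hypothetical reward: processed units minus a penalty on unmet demand."""
    return units_processed - penalty * unmet_units
```

Early in training `epsilon_at` is close to 1, so actions (for example, pick from pallet, pick from tote, or replenish) are mostly random; as the step count grows the agent increasingly follows its learned Q-values.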
About the journal:
Transactions on Emerging Telecommunications Technologies (ETT), formerly known as European Transactions on Telecommunications (ETT), has the following aims:
- to attract cutting-edge publications from leading researchers and research groups around the world
- to become a highly cited source of timely research findings in emerging fields of telecommunications
- to limit revision and publication cycles to a few months and thus make the journal significantly more attractive to authors
- to become the leading journal for publishing the latest developments in telecommunications