Kaibo Liang, Man Shan, Huwei Liu, Jianglong Yang, Chenxi Gu, Xiangyu Yin
Applied Soft Computing, vol. 179, Article 113283. Published 2025-05-30. Impact factor: 7.2 (JCR Q1, Computer Science, Artificial Intelligence).
DOI: 10.1016/j.asoc.2025.113283
https://www.sciencedirect.com/science/article/pii/S1568494625005940
Intelligent optimization of e-commerce order packing using deep reinforcement learning with heuristic strategies
The rapid expansion of e-commerce has intensified demands for efficient logistics, particularly in optimizing three-dimensional bin packing (3D-BPP) to balance space utilization, operational costs, and sustainability. Traditional methods often fail to address the dynamic, multi-constrained nature of e-commerce orders, which involve diverse item combinations, real-time decision-making, and complex practical constraints. This study proposes a hybrid framework that integrates deep reinforcement learning (DRL) with heuristic strategies to tackle these challenges. We first formulate a comprehensive mathematical model for 3D-BPP that explicitly incorporates rotation, boundary, and non-overlapping constraints. Building on this foundation, we develop a heuristic strategy system with five operator components for bin selection, item grouping, packing sequence, position selection, and orientation determination. To enhance adaptability, we introduce two DRL algorithms: the Order Packing Optimization DRL (OPO-DRL) for dynamic item sequencing and the Packing Combination Strategy DRL (PCS-DRL) for adaptive operator selection. The hybrid framework synergizes DRL’s learning capabilities with heuristic efficiency, enabling real-time adjustments to varying order patterns and bin specifications. Experimental validation using real-world data from JD.com demonstrates significant improvements, achieving an average packing rate of 68.60% with computation times of 0.16 s per order, outperforming state-of-the-art methods. Statistical analysis confirms significant improvements in both solution quality and computational efficiency compared to existing approaches. This work bridges theoretical optimization with operational realities, providing a scalable solution for modern warehouse automation and intelligent logistics systems.
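The abstract names rotation, boundary, and non-overlapping constraints and reports an average packing rate, but it does not give the formulation itself. The following is a minimal sketch of how such constraints are commonly checked for axis-aligned boxes. The names `Box`, `orientations`, `within_bin`, `overlaps`, `is_feasible`, and `packing_rate` are hypothetical, and the packing rate is assumed here to mean packed item volume divided by bin volume, which may differ from the paper's definition.

```python
from dataclasses import dataclass
from itertools import permutations


@dataclass(frozen=True)
class Box:
    """An item placed in a bin: minimum corner (x, y, z) and oriented extents."""
    x: float
    y: float
    z: float
    dx: float
    dy: float
    dz: float


def orientations(dims):
    """Rotation: distinct axis-aligned orientations of an item (at most six)."""
    return sorted(set(permutations(dims)))


def within_bin(box, bin_dims):
    """Boundary: the box must lie entirely inside the bin."""
    length, width, height = bin_dims
    return (
        box.x >= 0 and box.y >= 0 and box.z >= 0
        and box.x + box.dx <= length
        and box.y + box.dy <= width
        and box.z + box.dz <= height
    )


def overlaps(a, b):
    """True if the interiors of two boxes intersect on all three axes."""
    return (
        a.x < b.x + b.dx and b.x < a.x + a.dx
        and a.y < b.y + b.dy and b.y < a.y + a.dy
        and a.z < b.z + b.dz and b.z < a.z + a.dz
    )


def is_feasible(placed, bin_dims):
    """Boundary and non-overlap checks for a full placement."""
    if not all(within_bin(b, bin_dims) for b in placed):
        return False
    return not any(
        overlaps(a, b)
        for i, a in enumerate(placed)
        for b in placed[i + 1:]
    )


def packing_rate(placed, bin_dims):
    """Assumed definition: total item volume divided by bin volume."""
    length, width, height = bin_dims
    used = sum(b.dx * b.dy * b.dz for b in placed)
    return used / (length * width * height)
```

In this sketch, boxes that only share a face are treated as non-overlapping, since the comparisons are strict. A heuristic operator or a learned policy would choose the item order, position, and orientation; these functions only validate and score the result.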
About the journal:
Applied Soft Computing is an international journal promoting an integrated view of soft computing to solve real-life problems. The focus is to publish the highest-quality research in the application and convergence of Fuzzy Logic, Neural Networks, Evolutionary Computing, Rough Sets, and other similar techniques to address real-world complexities.
Applied Soft Computing is a rolling publication: articles are published as soon as the editor-in-chief has accepted them. Therefore, the web site will continuously be updated with new articles and the publication time will be short.