Intelligent optimization of e-commerce order packing using deep reinforcement learning with heuristic strategies

IF 7.2 | Zone 1 (Computer Science) | JCR Q1: Computer Science, Artificial Intelligence

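Applied Soft Computing, vol. 179, Article 113283 | Pub Date: 2025-05-30 | DOI: 10.1016/j.asoc.2025.113283 | https://www.sciencedirect.com/science/article/pii/S1568494625005940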

Kaibo Liang, Man Shan, Huwei Liu, Jianglong Yang, Chenxi Gu, Xiangyu Yin

Citations: 0

Abstract

The rapid expansion of e-commerce has intensified demands for efficient logistics, particularly in optimizing three-dimensional bin packing (3D-BPP) to balance space utilization, operational costs, and sustainability. Traditional methods often fail to address the dynamic, multi-constrained nature of e-commerce orders, which involve diverse item combinations, real-time decision-making, and complex practical constraints. This study proposes a hybrid framework that integrates deep reinforcement learning (DRL) with heuristic strategies to tackle these challenges. We first formulate a comprehensive mathematical model for 3D-BPP that explicitly incorporates rotation, boundary, and non-overlapping constraints. Building on this foundation, we develop a heuristic strategy system with five operator components for bin selection, item grouping, packing sequence, position selection, and orientation determination. To enhance adaptability, we introduce two DRL algorithms: the Order Packing Optimization DRL (OPO-DRL) for dynamic item sequencing and the Packing Combination Strategy DRL (PCS-DRL) for adaptive operator selection. The hybrid framework synergizes DRL’s learning capabilities with heuristic efficiency, enabling real-time adjustments to varying order patterns and bin specifications. Experimental validation using real-world data from JD.com demonstrates significant improvements, achieving an average packing rate of 68.60% with computation times of 0.16 s per order, outperforming state-of-the-art methods. Statistical analysis confirms significant improvements in both solution quality and computational efficiency compared to existing approaches. This work bridges theoretical optimization with operational realities, providing a scalable solution for modern warehouse automation and intelligent logistics systems.
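
The abstract names three constraint families (rotation, boundary, non-overlapping) and reports an average packing rate, but does not give their formulas here. The following is a minimal Python sketch of how such checks are commonly expressed for axis-aligned boxes. It is not the authors' model: the `Placement` type, the function names, and the definition of packing rate as packed item volume divided by bin volume are all assumptions made for illustration.

```python
from dataclasses import dataclass
from itertools import permutations


@dataclass(frozen=True)
class Placement:
    """Hypothetical record: item corner (x, y, z) and oriented size (dx, dy, dz)."""
    x: float
    y: float
    z: float
    dx: float
    dy: float
    dz: float


def orientations(dims):
    """Rotation: distinct axis-aligned orientations of a box (at most six)."""
    return sorted(set(permutations(dims)))


def within_bin(p, bin_dims):
    """Boundary: the item must lie entirely inside the bin."""
    length, width, height = bin_dims
    return (p.x >= 0 and p.y >= 0 and p.z >= 0
            and p.x + p.dx <= length
            and p.y + p.dy <= width
            and p.z + p.dz <= height)


def overlaps(a, b):
    """Non-overlap test: True if the two boxes share interior volume."""
    return (a.x < b.x + b.dx and b.x < a.x + a.dx
            and a.y < b.y + b.dy and b.y < a.y + a.dy
            and a.z < b.z + b.dz and b.z < a.z + a.dz)


def is_feasible(placements, bin_dims):
    """All items inside the bin and no pair of items overlapping."""
    if not all(within_bin(p, bin_dims) for p in placements):
        return False
    return not any(overlaps(a, b)
                   for i, a in enumerate(placements)
                   for b in placements[i + 1:])


def packing_rate(placements, bin_dims):
    """Assumed definition: total packed item volume / bin volume."""
    length, width, height = bin_dims
    used = sum(p.dx * p.dy * p.dz for p in placements)
    return used / (length * width * height)
```

Under these assumptions, two 2 x 2 x 2 items placed side by side in a 4 x 4 x 4 bin are feasible and fill a quarter of the bin volume. Boxes that only touch on a face do not count as overlapping because the comparisons are strict.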

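Source journal

Applied Soft Computing (Engineering and Technology, Computer Science: Interdisciplinary Applications)

CiteScore: 15.80
Self-citation rate: 6.90%
Articles published: 874
Review time: 10.9 months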

About the journal: Applied Soft Computing is an international journal promoting an integrated view of soft computing to solve real-life problems. Its focus is on publishing the highest-quality research on the application and convergence of Fuzzy Logic, Neural Networks, Evolutionary Computing, Rough Sets, and similar techniques to address real-world complexities. Applied Soft Computing is a rolling publication: articles are published as soon as the editor-in-chief has accepted them. The website is therefore updated continuously with new articles, and the publication time is short.