A deep learning and policy optimization approach for supply chain order classification

Ramakrishna Garine, Ripon K. Chakrabortty
{"title":"供应链订单分类的深度学习和策略优化方法","authors":"Ramakrishna Garine ,&nbsp;Ripon K. Chakrabortty","doi":"10.1016/j.sca.2025.100166","DOIUrl":null,"url":null,"abstract":"<div><div>Timely delivery is a critical performance metric in supply chain management, yet achieving consistent on-time delivery has become increasingly challenging in the face of global uncertainties and complex logistics networks. Recent disruptions, such as pandemics, extreme weather events, and geopolitical conflicts, have exposed vulnerabilities in supply chains, resulting in frequent delivery delays. While traditional heuristics and simple statistical methods have proven inadequate to capture the myriad factors that contribute to delays in modern supply chains, Machine learning (ML) and Deep Learning (DL) approaches have emerged as powerful tools to improve the accuracy and reliability of delivery delay prediction. Consequently, this study presents a hybrid predictive framework that integrates DL models with Reinforcement Learning (RL) to improve binary classification of order status (on-time vs. late). We first benchmark several DL architectures, Convolutional Neural Network (CNN), Long Short-Term Memory (LSTM), Bi-LSTM, and Stacked LSTM, enhanced with regularization and extended training epochs, alongside a fine-tuned eXtreme Gradient Boost (XGBoost) model. These models are evaluated using accuracy, precision, recall, and the F1-score, with Bi-LSTM and Stacked LSTM achieving strong generalization performance. Building on this, we deploy a Proximal Policy Optimization (PPO) agent that incorporates deep learning outputs as part of its observation space. The RL agent uses a reward-based feedback loop to improve adaptability under dynamic conditions. Experimental results show that the hybrid DL-RL model achieves superior classification accuracy and an F1-score greater than 0.99, outperforming standalone methods. Although the PPO agent alone struggled with detecting minorities due to imbalance, integrating DL features mitigated this limitation. The findings support the use of hybrid architectures for real-time order status prediction and provide a scalable pathway for intelligent supply chain decision making. Future work will address class imbalance and enhance policy robustness through cost-sensitive and explainable RL strategies.</div></div>","PeriodicalId":101186,"journal":{"name":"Supply Chain Analytics","volume":"12 ","pages":"Article 100166"},"PeriodicalIF":0.0000,"publicationDate":"2025-09-29","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"A deep learning and policy optimization approach for supply chain order classification\",\"authors\":\"Ramakrishna Garine ,&nbsp;Ripon K. Chakrabortty\",\"doi\":\"10.1016/j.sca.2025.100166\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"<div><div>Timely delivery is a critical performance metric in supply chain management, yet achieving consistent on-time delivery has become increasingly challenging in the face of global uncertainties and complex logistics networks. Recent disruptions, such as pandemics, extreme weather events, and geopolitical conflicts, have exposed vulnerabilities in supply chains, resulting in frequent delivery delays. 
While traditional heuristics and simple statistical methods have proven inadequate to capture the myriad factors that contribute to delays in modern supply chains, Machine learning (ML) and Deep Learning (DL) approaches have emerged as powerful tools to improve the accuracy and reliability of delivery delay prediction. Consequently, this study presents a hybrid predictive framework that integrates DL models with Reinforcement Learning (RL) to improve binary classification of order status (on-time vs. late). We first benchmark several DL architectures, Convolutional Neural Network (CNN), Long Short-Term Memory (LSTM), Bi-LSTM, and Stacked LSTM, enhanced with regularization and extended training epochs, alongside a fine-tuned eXtreme Gradient Boost (XGBoost) model. These models are evaluated using accuracy, precision, recall, and the F1-score, with Bi-LSTM and Stacked LSTM achieving strong generalization performance. Building on this, we deploy a Proximal Policy Optimization (PPO) agent that incorporates deep learning outputs as part of its observation space. The RL agent uses a reward-based feedback loop to improve adaptability under dynamic conditions. Experimental results show that the hybrid DL-RL model achieves superior classification accuracy and an F1-score greater than 0.99, outperforming standalone methods. Although the PPO agent alone struggled with detecting minorities due to imbalance, integrating DL features mitigated this limitation. The findings support the use of hybrid architectures for real-time order status prediction and provide a scalable pathway for intelligent supply chain decision making. Future work will address class imbalance and enhance policy robustness through cost-sensitive and explainable RL strategies.</div></div>\",\"PeriodicalId\":101186,\"journal\":{\"name\":\"Supply Chain Analytics\",\"volume\":\"12 \",\"pages\":\"Article 100166\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2025-09-29\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Supply Chain Analytics\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://www.sciencedirect.com/science/article/pii/S2949863525000664\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Supply Chain Analytics","FirstCategoryId":"1085","ListUrlMain":"https://www.sciencedirect.com/science/article/pii/S2949863525000664","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Citations: 0

Abstract

Timely delivery is a critical performance metric in supply chain management, yet achieving consistent on-time delivery has become increasingly challenging in the face of global uncertainties and complex logistics networks. Recent disruptions, such as pandemics, extreme weather events, and geopolitical conflicts, have exposed vulnerabilities in supply chains, resulting in frequent delivery delays. While traditional heuristics and simple statistical methods have proven inadequate to capture the myriad factors that contribute to delays in modern supply chains, machine learning (ML) and deep learning (DL) approaches have emerged as powerful tools for improving the accuracy and reliability of delivery delay prediction. Consequently, this study presents a hybrid predictive framework that integrates DL models with Reinforcement Learning (RL) to improve binary classification of order status (on-time vs. late). We first benchmark several DL architectures: a Convolutional Neural Network (CNN), Long Short-Term Memory (LSTM), Bi-LSTM, and Stacked LSTM, each enhanced with regularization and extended training epochs, alongside a fine-tuned eXtreme Gradient Boosting (XGBoost) model. These models are evaluated using accuracy, precision, recall, and the F1-score, with Bi-LSTM and Stacked LSTM achieving strong generalization performance. Building on this, we deploy a Proximal Policy Optimization (PPO) agent that incorporates the deep learning outputs as part of its observation space. The RL agent uses a reward-based feedback loop to improve adaptability under dynamic conditions. Experimental results show that the hybrid DL-RL model achieves superior classification accuracy and an F1-score greater than 0.99, outperforming standalone methods. Although the PPO agent alone struggled to detect the minority (late-order) class due to class imbalance, integrating DL features mitigated this limitation. The findings support the use of hybrid architectures for real-time order status prediction and provide a scalable pathway for intelligent supply chain decision-making. Future work will address class imbalance and enhance policy robustness through cost-sensitive and explainable RL strategies.
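
To make the hybrid architecture concrete, below is a minimal, hypothetical sketch of the core idea the abstract describes: a PPO agent whose observation concatenates each order's features with a DL classifier's predicted late-delivery probability, trained with a reward-based feedback loop (+1 for a correct on-time/late call, -1 otherwise). The environment name (OrderStatusEnv), the reward scheme, the synthetic data, and the logistic score standing in for the Bi-LSTM output are all illustrative assumptions, as is the use of gymnasium and stable-baselines3; the paper's exact setup may differ.

    import numpy as np
    import gymnasium as gym
    from gymnasium import spaces
    from stable_baselines3 import PPO

    class OrderStatusEnv(gym.Env):
        """One order per episode: observe features plus a DL score, predict on-time (0) or late (1)."""

        def __init__(self, features, dl_scores, labels):
            super().__init__()
            self.features = features.astype(np.float32)
            self.dl_scores = dl_scores.astype(np.float32)
            self.labels = labels
            # Observation = raw order features augmented with the DL late-probability.
            self.observation_space = spaces.Box(
                low=-np.inf, high=np.inf, shape=(features.shape[1] + 1,), dtype=np.float32
            )
            self.action_space = spaces.Discrete(2)  # 0 = on-time, 1 = late
            self.i = 0

        def _obs(self):
            return np.append(self.features[self.i], self.dl_scores[self.i])

        def reset(self, seed=None, options=None):
            super().reset(seed=seed)
            self.i = int(self.np_random.integers(len(self.labels)))  # draw a random order
            return self._obs(), {}

        def step(self, action):
            # Reward-based feedback: +1 for a correct call, -1 otherwise.
            reward = 1.0 if int(action) == int(self.labels[self.i]) else -1.0
            return self._obs(), reward, True, False, {}  # one classification per episode

    # Synthetic stand-ins: order features, on-time/late labels, and a logistic score
    # playing the role of the Bi-LSTM's predicted late-delivery probability.
    rng = np.random.default_rng(0)
    X = rng.normal(size=(500, 8))
    y = (X[:, 0] + 0.5 * X[:, 1] > 0).astype(int)
    p_late = 1.0 / (1.0 + np.exp(-(X[:, 0] + 0.5 * X[:, 1])))

    agent = PPO("MlpPolicy", OrderStatusEnv(X, p_late, y), verbose=0)
    agent.learn(total_timesteps=10_000)

Framing each order as a one-step episode reduces the task to a contextual-bandit problem, which is the simplest way to reproduce the reward-driven classification loop described above; dropping the dl_scores column from the observation recovers a standalone PPO baseline of the kind the abstract reports struggling with the minority class.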