Q-Learning Algorithm for Fourth Party Logistics Route Optimization Considering Tardiness Risk

2022 International Conference on Cyber-Physical Social Intelligence (ICCSI). Pub Date: 2022-11-18. DOI: 10.1109/ICCSI55536.2022.9970625

Xin Liu, Guihua Bo



Abstract

To address route optimization in fourth party logistics (4PL), where a dynamic and complex environment may prevent delivery tasks from being completed on time, this paper introduces value at risk (VaR) to measure tardiness risk and establishes a mathematical model with tardiness risk as the objective function and delivery cost as a constraint. The model aims to provide customers with satisfactory delivery service at limited cost and with minimal risk of delay. Because the problem is nonlinear and NP-hard, the Q-learning algorithm is applied to the fourth party logistics routing optimization problem (4PLROP), with the reward value redesigned and redefined for this setting. Numerical experiments at several problem scales compare the results of three algorithms. The results show that the constructed stochastic model can control tardiness risk effectively and that the presented algorithms can quickly obtain satisfactory solutions for the customer's different confidence levels.
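The abstract does not give the model or the reward definition, so the following is only a minimal sketch of the general idea, not the authors' method. It assumes a tiny hypothetical multigraph in which parallel arcs stand for alternative providers, normally distributed arc times, an empirical VaR taken as the BETA-quantile of sampled tardiness, and a terminal reward equal to the negative VaR (or a fixed penalty when the cost budget is exceeded). All names and numbers (`ARCS`, `DUE_DATE`, `BUDGET`, `BETA`, `PENALTY`) are illustrative assumptions. Because VaR is not additive over arcs, the sketch uses the partial route as the Q-learning state.

```python
import random
from functools import lru_cache

# Hypothetical multigraph: node -> list of (next_node, cost, mean_time, std_time).
# Parallel arcs between the same nodes stand for alternative providers.
ARCS = {
    0: [(1, 2.0, 6.0, 2.0), (1, 4.0, 5.0, 0.5), (2, 3.0, 4.0, 1.0)],
    1: [(3, 1.0, 5.0, 2.5), (3, 3.0, 4.0, 0.5)],
    2: [(3, 5.0, 5.0, 0.5), (3, 2.0, 7.0, 1.5)],
}
SOURCE, DEST = 0, 3
DUE_DATE, BUDGET, BETA = 10.0, 7.0, 0.9  # assumed due date, cost limit, confidence level
PENALTY = -100.0                          # assumed reward for exceeding the budget


def arcs_of(route):
    """Translate a tuple of arc indices into (end node, arcs travelled from SOURCE)."""
    node, arcs = SOURCE, []
    for idx in route:
        arc = ARCS[node][idx]
        arcs.append(arc)
        node = arc[0]
    return node, arcs


@lru_cache(maxsize=None)
def tardiness_var(route, samples=4000):
    """Empirical VaR: the BETA-quantile of max(0, total time - DUE_DATE)."""
    rng = random.Random(0)  # fixed seed so every route sees the same random draws
    _, arcs = arcs_of(route)
    tardiness = sorted(
        max(0.0, sum(max(0.0, rng.gauss(m, s)) for _, _, m, s in arcs) - DUE_DATE)
        for _ in range(samples)
    )
    return tardiness[min(samples - 1, int(BETA * samples))]


def reward(route):
    """Terminal reward: negative tardiness VaR if the route respects the budget."""
    _, arcs = arcs_of(route)
    cost = sum(c for _, c, _, _ in arcs)
    return -tardiness_var(route) if cost <= BUDGET else PENALTY


def q_learning(episodes=3000, alpha=0.2, gamma=1.0, epsilon=0.2, seed=1):
    rng = random.Random(seed)
    q = {}  # (partial route, arc index) -> estimated value

    def best(route, node):
        return max(range(len(ARCS[node])), key=lambda a: q.get((route, a), 0.0))

    for _ in range(episodes):
        route, node = (), SOURCE
        while node != DEST:
            # Epsilon-greedy choice among the arcs leaving the current node.
            if rng.random() < epsilon:
                a = rng.randrange(len(ARCS[node]))
            else:
                a = best(route, node)
            nxt_route, nxt_node = route + (a,), ARCS[node][a][0]
            if nxt_node == DEST:
                target = reward(nxt_route)  # reward is only paid on arrival
            else:
                b = best(nxt_route, nxt_node)
                target = gamma * q.get((nxt_route, b), 0.0)
            old = q.get((route, a), 0.0)
            q[(route, a)] = old + alpha * (target - old)
            route, node = nxt_route, nxt_node

    # Greedy rollout of the learned policy.
    route, node = (), SOURCE
    while node != DEST:
        a = best(route, node)
        route, node = route + (a,), ARCS[node][a][0]
    return route


best_route = q_learning()
print(best_route, tardiness_var(best_route))
```

Raising `BETA` makes the quantile more conservative, which is one way a customer's confidence level could enter such a model. Using the partial route as the state keeps the example correct on a toy graph but would not scale to large networks.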

