Ran Tian;Zhihui Sun;Longlong Chang;Jiarui Wu;Xin Lu
IEEE Transactions on Intelligent Transportation Systems, vol. 26, no. 6, pp. 7640-7654, published 2025-04-21. DOI: 10.1109/TITS.2025.3559941. URL: https://ieeexplore.ieee.org/document/10972166/
Rapid Solution for Flexible Pickup and Delivery Services Problem Based on Improved Actor-Critic Deep Reinforcement Learning
The Flexible Pickup and Delivery Services Problem (FPDSP) arises from the practical needs of multi-warehouse management strategies and is one of the key challenges in urban distribution logistics. The goal is to plan routes quickly in complex scenarios so that the vehicle’s total travel time is minimized while the time window requirements are met. To address this problem, we propose a deep reinforcement learning method based on the Actor-Critic algorithm that quickly computes near-optimal solutions to the FPDSP. Specifically, we propose a Transformer Model with Parallel Encoders (TMPE). The model extracts order features through parallel encoders and then uses serial decoders to fuse the feature information and improve the order selection process. In addition, we design a reward function that reduces the number of repeated pickups the vehicle makes at the same consignor’s location across different orders, thereby reducing the vehicle’s total travel time. Experimental results on seven different datasets show that our method finds feasible solutions quickly compared with heuristic methods. Moreover, compared with all baseline methods, our method attains 14 optimal solutions, a significant improvement in problem-solving ability. This result provides a new approach to optimizing multi-warehouse pickup and delivery logistics in cities.
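The abstract does not give the form of the reward function, so the following is only a minimal sketch of the idea it describes: reward a route with its negative travel time and subtract a penalty each time the vehicle returns to a consignor location it has already picked up from. The function names, the Euclidean travel-time model, the `speed` and `repeat_penalty` parameters, and the rule that back-to-back pickups at one location count as a single visit are all assumptions made for illustration, not the authors' definition.

```python
from collections import Counter
from math import hypot


def count_repeated_pickups(route):
    """Count extra pickup visits to a location the vehicle already picked up from.

    route: list of (location_id, kind) stops in visiting order, where kind is
    "pickup" or "delivery". Back-to-back pickups at the same location are
    treated as one visit (an assumption of this sketch).
    """
    visits = Counter()
    prev_loc = None
    for loc, kind in route:
        if kind == "pickup" and loc != prev_loc:
            visits[loc] += 1
        prev_loc = loc
    return sum(n - 1 for n in visits.values())


def route_reward(route, coords, speed=1.0, repeat_penalty=5.0):
    """Hypothetical reward: negative travel time minus a repeat-pickup penalty.

    coords maps location_id -> (x, y); travel time is Euclidean distance
    divided by speed (an assumed model, not taken from the paper).
    """
    travel_time = 0.0
    for (a, _), (b, _) in zip(route, route[1:]):
        (xa, ya), (xb, yb) = coords[a], coords[b]
        travel_time += hypot(xb - xa, yb - ya) / speed
    return -travel_time - repeat_penalty * count_repeated_pickups(route)


# Two orders share the consignor location "W".
coords = {"W": (0.0, 0.0), "A": (3.0, 4.0), "B": (6.0, 8.0)}

# The vehicle returns to W for the second order: one repeated pickup.
separate = [("W", "pickup"), ("A", "delivery"), ("W", "pickup"), ("B", "delivery")]

# Both orders are collected in a single visit to W.
batched = [("W", "pickup"), ("W", "pickup"), ("A", "delivery"), ("B", "delivery")]

print(route_reward(separate, coords))
print(route_reward(batched, coords))
```

Under these assumptions the batched route scores higher on both terms: it avoids the penalty and also travels less, which matches the abstract's claim that fewer repeated pickups lead to a shorter total travel time.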
About the journal:
IEEE Transactions on Intelligent Transportation Systems covers the theoretical, experimental, and operational aspects of electrical and electronics engineering and information technologies as applied to Intelligent Transportation Systems (ITS). Intelligent Transportation Systems are defined as systems that use synergistic technologies and systems engineering concepts to develop and improve transportation systems of all kinds. The scope of this interdisciplinary activity includes promoting, consolidating, and coordinating ITS technical activities among IEEE entities, and providing a focus for cooperative activities, both internal and external.