A deep reinforcement learning method for solving Two-Echelon Location-Routing Problem
Authors: Shuo Huang, Yaoxin Wu, Zhiguang Cao, Xuexi Zhang
Journal: Computers & Operations Research, Volume 183, Article 107210
DOI: 10.1016/j.cor.2025.107210
Published: 2025-07-21
Impact factor: 4.3 (JCR Q2, Computer Science, Interdisciplinary Applications)
URL: https://www.sciencedirect.com/science/article/pii/S0305054825002382
Citations: 0
Abstract
In logistics and supply chain management, optimizing distribution networks is crucial for improving efficiency and reducing operational costs. This paper addresses the Two-Echelon Location-Routing Problem (2E-LRP), which jointly optimizes the placement of facilities (i.e., transfer stations and depots) and the vehicle routes that transport goods between depots, transfer stations, and customers. We propose a method based on deep reinforcement learning that minimizes the total cost, comprising the operational cost of facilities, the cost of vehicle usage, and the transportation cost. Specifically, we design a two-stage attention model with an encoder–decoder structure that constructs the location-routing solution for each of the two echelons in turn. A simple yet effective recurrent unit is used in the decoder to capture context embeddings, allowing the model to selectively incorporate beneficial information from previous construction steps. The contexts are then used in the attention computation to select facilities and customers, and thus to determine facility placements and routes. The model is trained with the REINFORCE algorithm using a shared baseline, and its performance is validated through comparisons with the Gurobi solver and typical heuristic algorithms. Extensive results show that our model performs favorably on both synthetic and benchmark instances, offering a competitive alternative to traditional solution methods. In particular, on large instances our model achieves up to a 1.5% cost reduction and over 99% savings in computation time compared with traditional heuristic algorithms. In addition, the model generalizes fairly well to instances of different scales and distributions.
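The abstract mentions training with REINFORCE and a shared baseline but does not specify how the baseline is computed. The following is a minimal sketch under one common assumption: several solutions are sampled for the same instance, and the baseline is their mean cost. The function names and the numeric values are hypothetical and are not taken from the paper.

```python
from statistics import mean


def shared_baseline_advantages(costs):
    """Advantage of each sampled solution relative to a shared baseline.

    Assumption (not from the paper): the baseline is the mean cost of all
    solutions sampled for the same problem instance.
    """
    baseline = mean(costs)
    return [c - baseline for c in costs]


def reinforce_loss(costs, log_probs):
    """Surrogate loss: mean of advantage times solution log-probability.

    Differentiating this value with respect to the policy parameters
    (through log_probs) gives the REINFORCE gradient estimate.
    """
    advantages = shared_baseline_advantages(costs)
    return mean(a * lp for a, lp in zip(advantages, log_probs))


# Hypothetical total costs and log-probabilities of three sampled solutions.
costs = [10.0, 12.0, 14.0]
log_probs = [-1.0, -2.0, -3.0]

advantages = shared_baseline_advantages(costs)
loss = reinforce_loss(costs, log_probs)
```

Because the baseline is the sample mean, the advantages sum to zero: solutions cheaper than average receive a negative advantage, so minimizing the loss raises their probability. In an actual training loop, `log_probs` would come from the decoder and the loss would be minimized with an automatic-differentiation framework.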
About the journal:
Operations research and computers meet in a large number of scientific fields, many of which are of vital current concern to our troubled society. These include, among others, ecology, transportation, safety, reliability, urban planning, economics, inventory control, investment strategy and logistics (including reverse logistics). Computers & Operations Research provides an international forum for the application of computers and operations research techniques to problems in these and related fields.