{"title":"A Graph Reinforcement Learning Framework for Neural Adaptive Large Neighbourhood Search","authors":"Syu-Ning Johnn , Victor-Alexandru Darvariu , Julia Handl , Jörg Kalcsics","doi":"10.1016/j.cor.2024.106791","DOIUrl":null,"url":null,"abstract":"<div><p>Adaptive Large Neighbourhood Search (ALNS) is a popular metaheuristic with renowned efficiency in solving combinatorial optimisation problems. However, despite 18 years of intensive research into ALNS, the design of an effective adaptive layer for selecting operators to improve the solution remains an open question. In this work, we isolate this problem by formulating it as a Markov Decision Process, in which an agent is rewarded proportionally to the improvement of the incumbent. We propose Graph Reinforcement Learning for Operator Selection (GRLOS), a method based on Deep Reinforcement Learning and Graph Neural Networks, as well as Learned Roulette Wheel (LRW), a lightweight approach inspired by the classic Roulette Wheel adaptive layer. The methods, which are broadly applicable to optimisation problems that can be represented as graphs, are comprehensively evaluated on 5 routing problems using a large portfolio of 28 destroy and 7 repair operators. Results show that both GRLOS and LRW outperform the classic selection mechanism in ALNS, owing to the operator choices being learned in a prior training phase. GRLOS is also shown to consistently achieve better performance than a recent Deep Reinforcement Learning method due to its substantially more flexible state representation. The evaluation further examines the impact of the operator budget and type of initial solution, and is applied to problem instances with up to 1000 customers. 
The findings arising from our extensive benchmarking bear relevance to the wider literature of hybrid methods combining metaheuristics and machine learning.</p></div>","PeriodicalId":10542,"journal":{"name":"Computers & Operations Research","volume":"172 ","pages":"Article 106791"},"PeriodicalIF":4.1000,"publicationDate":"2024-08-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.sciencedirect.com/science/article/pii/S0305054824002636/pdfft?md5=03a7927599315f8e665c819357d5a172&pid=1-s2.0-S0305054824002636-main.pdf","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Computers & Operations Research","FirstCategoryId":"5","ListUrlMain":"https://www.sciencedirect.com/science/article/pii/S0305054824002636","RegionNum":2,"RegionCategory":"工程技术","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q2","JCRName":"COMPUTER SCIENCE, INTERDISCIPLINARY APPLICATIONS","Score":null,"Total":0}
Citation count: 0
Abstract
Adaptive Large Neighbourhood Search (ALNS) is a popular metaheuristic with renowned efficiency in solving combinatorial optimisation problems. However, despite 18 years of intensive research into ALNS, the design of an effective adaptive layer for selecting operators to improve the solution remains an open question. In this work, we isolate this problem by formulating it as a Markov Decision Process, in which an agent is rewarded proportionally to the improvement of the incumbent. We propose Graph Reinforcement Learning for Operator Selection (GRLOS), a method based on Deep Reinforcement Learning and Graph Neural Networks, as well as Learned Roulette Wheel (LRW), a lightweight approach inspired by the classic Roulette Wheel adaptive layer. The methods, which are broadly applicable to optimisation problems that can be represented as graphs, are comprehensively evaluated on 5 routing problems using a large portfolio of 28 destroy and 7 repair operators. Results show that both GRLOS and LRW outperform the classic selection mechanism in ALNS, owing to the operator choices being learned in a prior training phase. GRLOS is also shown to consistently achieve better performance than a recent Deep Reinforcement Learning method due to its substantially more flexible state representation. The evaluation further examines the impact of the operator budget and type of initial solution, and is applied to problem instances with up to 1000 customers. The findings arising from our extensive benchmarking bear relevance to the wider literature of hybrid methods combining metaheuristics and machine learning.
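To make the operator-selection setting concrete, the following is a minimal sketch of a classic roulette-wheel adaptive layer with a reward proportional to the improvement of the incumbent. It is not the authors' GRLOS or LRW implementation. Everything in it is an assumption made for illustration: the function names, the decay factor of 0.8, and the two toy operators, which act directly on a scalar cost instead of destroying and repairing an actual routing solution.

```python
import random


def roulette_select(weights, rng):
    """Pick an operator index with probability proportional to its weight."""
    return rng.choices(range(len(weights)), weights=weights, k=1)[0]


def improvement_reward(incumbent_cost, candidate_cost):
    """Reward proportional to the incumbent's improvement (minimisation); 0 if none."""
    return max(0.0, incumbent_cost - candidate_cost)


def update_weight(weight, reward, decay=0.8):
    """Exponentially smooth an operator's weight towards its latest reward.

    The decay value is an illustrative assumption, not taken from the paper.
    """
    return decay * weight + (1.0 - decay) * reward


def weak_operator(cost, rng):
    """Hypothetical operator: small expected improvement per application."""
    return cost - rng.uniform(-1.0, 0.5)


def strong_operator(cost, rng):
    """Hypothetical operator: larger expected improvement per application."""
    return cost - rng.uniform(-1.0, 2.0)


def run_toy_search(operators, start_cost, iterations=200, seed=0):
    """Run a toy search loop; return the final incumbent cost and operator weights."""
    rng = random.Random(seed)
    weights = [1.0] * len(operators)
    incumbent = start_cost
    for _ in range(iterations):
        i = roulette_select(weights, rng)
        candidate = operators[i](incumbent, rng)
        reward = improvement_reward(incumbent, candidate)
        # Floor the weight so that every operator keeps a nonzero selection chance.
        weights[i] = max(1e-6, update_weight(weights[i], reward))
        # Keep the better of the incumbent and the candidate.
        incumbent = min(incumbent, candidate)
    return incumbent, weights


if __name__ == "__main__":
    best, final_weights = run_toy_search([weak_operator, strong_operator], 100.0)
    print("final incumbent cost:", best)
    print("operator weights:", final_weights)
```

In this sketch the weights are adapted online during a single run. The abstract, by contrast, attributes the gains of GRLOS and LRW to operator choices learned in a prior training phase.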
About the journal:
Operations research and computers meet in a large number of scientific fields, many of which are of vital current concern to our troubled society. These include, among others, ecology, transportation, safety, reliability, urban planning, economics, inventory control, investment strategy and logistics (including reverse logistics). Computers & Operations Research provides an international forum for the application of computers and operations research techniques to problems in these and related fields.