Enhancing the generalization capability of 2D array pointer networks through multiple teacher-forcing knowledge distillation

Qidong Liu, Xin Shen, Chaoyue Liu, Dong Chen, Xin Zhou, Mingliang Xu

Journal of Automation and Intelligence, Volume 4, Issue 1, Pages 29–38, March 2025. DOI: 10.1016/j.jai.2024.12.007. Available at: https://www.sciencedirect.com/science/article/pii/S2949855425000012
The Heterogeneous Capacitated Vehicle Routing Problem (HCVRP), which involves efficiently routing vehicles with diverse capacities to fulfill various customer demands at minimal cost, poses an NP-hard challenge in combinatorial optimization. Recently, reinforcement learning approaches such as 2D Array Pointer Networks (2D-Ptr) have demonstrated remarkable decision-making speed by modeling multiple agents' concurrent choices as a sequence of consecutive actions. However, these learning-based models often struggle with generalization, meaning they cannot adapt to new scenarios with varying numbers of vehicles or customers without retraining. Inspired by the potential of multi-teacher knowledge distillation to harness diverse knowledge from multiple sources and craft a comprehensive student model, we propose to enhance the generalization capability of 2D-Ptr through Multiple Teacher-forcing Knowledge Distillation (MTKD). We first train 12 distinct 2D-Ptr models under various settings to serve as teacher models. Subsequently, we randomly sample a teacher model and a batch of problem instances, focusing on those instances where the chosen teacher performs best among all teachers. This teacher model then solves these instances, generating high-reward action sequences that guide knowledge transfer to the student model. We conduct rigorous evaluations across four distinct datasets, each comprising four HCVRP instances of varying scales. Our empirical findings underscore the proposed method's superiority over existing learning-based methods in terms of both computational efficiency and solution quality.
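For concreteness, one MTKD update as described in the abstract might look like the following PyTorch sketch. The interfaces here (a model's `rollout` and `log_prob_of` methods, and `sample_instances`) are hypothetical placeholders for illustration, not the authors' released implementation:

```python
# A minimal sketch of one MTKD update, assuming PyTorch and placeholder
# interfaces. `rollout`, `log_prob_of`, and `sample_instances` are
# hypothetical stand-ins, not the authors' code.
import random
import torch

def mtkd_step(teachers, student, optimizer, sample_instances, batch_size=64):
    """One distillation step: sample a teacher and a batch of instances,
    keep the instances this teacher solves best, and teacher-force the
    student on the teacher's high-reward action sequences."""
    instances = sample_instances(batch_size)      # batch of HCVRP instances
    t_idx = random.randrange(len(teachers))       # pick one of the 12 teachers

    with torch.no_grad():
        # Greedy rollouts from every teacher:
        # actions (batch, steps) action ids, rewards (batch,) e.g. -tour_cost
        outs = [t.rollout(instances) for t in teachers]
        rewards = torch.stack([r for _, r in outs])   # (n_teachers, batch)
        # Keep only instances where the sampled teacher performs best.
        keep = rewards.argmax(dim=0) == t_idx
        actions = outs[t_idx][0][keep]                # teacher action sequences

    if keep.sum() == 0:
        return None  # sampled teacher was best on no instance in this batch

    # Teacher forcing: the student scores the teacher's actions step by step;
    # log_probs has shape (kept, steps), one log-prob per imitated action.
    log_probs = student.log_prob_of(instances[keep], actions)
    loss = -log_probs.sum(dim=1).mean()               # sequence-level NLL

    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```

The filtering step (`keep`) reflects the abstract's idea of distilling each teacher only on the instances it handles best, so the student imitates the strongest available expert per instance rather than an averaged signal.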