Enhancing the generalization capability of 2D array pointer networks through multiple teacher-forcing knowledge distillation

Qidong Liu, Xin Shen, Chaoyue Liu, Dong Chen, Xin Zhou, Mingliang Xu

Journal of Automation and Intelligence, Volume 4, Issue 1, Pages 29–38, March 2025. DOI: 10.1016/j.jai.2024.12.007. Available at: https://www.sciencedirect.com/science/article/pii/S2949855425000012
The Heterogeneous Capacitated Vehicle Routing Problem (HCVRP), which involves efficiently routing vehicles with diverse capacities to fulfill various customer demands at minimal cost, poses an NP-hard challenge in combinatorial optimization. Recently, reinforcement learning approaches such as 2D Array Pointer Networks (2D-Ptr) have demonstrated remarkable decision-making speed by modeling multiple agents' concurrent choices as a sequence of consecutive actions. However, these learning-based models often struggle with generalization, meaning they cannot adapt to new scenarios with varying numbers of vehicles or customers without retraining. Inspired by the potential of multi-teacher knowledge distillation to harness diverse knowledge from multiple sources and craft a comprehensive student model, we propose to enhance the generalization capability of 2D-Ptr through Multiple Teacher-forcing Knowledge Distillation (MTKD). We first train 12 distinct 2D-Ptr models under various settings to serve as teacher models. Subsequently, we randomly sample a teacher model and a batch of problem instances, focusing on those instances where the chosen teacher performs best among all teachers. This teacher model then solves these instances, generating high-reward action sequences that guide knowledge transfer to the student model. We conduct rigorous evaluations across four distinct datasets, each comprising four HCVRP instances of varying scales. Our empirical findings underscore the proposed method's superiority over existing learning-based methods in terms of both computational efficiency and solution quality.
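For concreteness, one MTKD update as described in the abstract might look like the following PyTorch sketch. The interfaces here (a model's `rollout` and `log_prob_of` methods, and `sample_instances`) are hypothetical placeholders for illustration, not the authors' released implementation:

```python
# A minimal sketch of one MTKD update, assuming PyTorch and placeholder
# interfaces. `rollout`, `log_prob_of`, and `sample_instances` are
# hypothetical stand-ins, not the authors' code.
import random
import torch

def mtkd_step(teachers, student, optimizer, sample_instances, batch_size=64):
    """One distillation step: sample a teacher and a batch of instances,
    keep the instances this teacher solves best, and teacher-force the
    student on the teacher's high-reward action sequences."""
    instances = sample_instances(batch_size)      # batch of HCVRP instances
    t_idx = random.randrange(len(teachers))       # pick one of the 12 teachers

    with torch.no_grad():
        # Greedy rollouts from every teacher:
        # actions (batch, steps) action ids, rewards (batch,) e.g. -tour_cost
        outs = [t.rollout(instances) for t in teachers]
        rewards = torch.stack([r for _, r in outs])   # (n_teachers, batch)
        # Keep only instances where the sampled teacher performs best.
        keep = rewards.argmax(dim=0) == t_idx
        actions = outs[t_idx][0][keep]                # teacher action sequences

    if keep.sum() == 0:
        return None  # sampled teacher was best on no instance in this batch

    # Teacher forcing: the student scores the teacher's actions step by step;
    # log_probs has shape (kept, steps), one log-prob per imitated action.
    log_probs = student.log_prob_of(instances[keep], actions)
    loss = -log_probs.sum(dim=1).mean()               # sequence-level NLL

    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```

The filtering step (`keep`) reflects the abstract's idea of distilling each teacher only on the instances it handles best, so the student imitates the strongest available expert per instance rather than an averaged signal.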