Multiobjective Optimization for Traveling Salesman Problem: A Deep Reinforcement Learning Algorithm via Transfer Learning

Le-yang Gao;Rui Wang;Zhao-hong Jia;Chuang Liu
IEEE Transactions on Artificial Intelligence, vol. 6, no. 4, pp. 896-908. DOI: 10.1109/TAI.2024.3499946. Published 2024-11-15. Available at https://ieeexplore.ieee.org/document/10754652/

Abstract

A wide range of real applications can be modelled as the multiobjective traveling salesman problem (MOTSP), a typical combinatorial optimization problem. Meta-heuristics can be used to address the MOTSP; however, because they iteratively search a large solution space, they often entail significant computation time. Recently, deep reinforcement learning (DRL) algorithms have been employed to generate approximately optimal solutions to single-objective traveling salesman problems as well as MOTSPs. This study proposes a multiobjective optimization algorithm based on DRL, called the multiobjective pointer network (MOPN), in which the input structure of the pointer network is redesigned so that it applies to the MOTSP. Furthermore, a training strategy utilizing a representative model and transfer learning is introduced to enhance the performance of MOPN. The proposed MOPN is insensitive to problem scale, meaning that a trained MOPN can address MOTSPs of different scales. Compared to meta-heuristics, MOPN takes much less time on forward propagation to obtain the Pareto front. To verify the performance of the model, extensive experiments are conducted on three different MOTSPs to compare MOPN with two state-of-the-art DRL models and two multiobjective meta-heuristics. Experimental results demonstrate that the proposed MOPN obtains the best solutions with the least training time among all compared DRL methods.
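The abstract refers to evaluating tours against multiple objectives and extracting the Pareto front. As a minimal illustration of those two concepts only (this is not the authors' MOPN model, and the function names are hypothetical), the sketch below scores a tour under two distance matrices and filters a set of objective vectors down to the non-dominated ones:

```python
def tour_lengths(tour, dist_a, dist_b):
    """Evaluate one tour under two distance matrices (two objectives, both minimized)."""
    n = len(tour)
    a = sum(dist_a[tour[i]][tour[(i + 1) % n]] for i in range(n))
    b = sum(dist_b[tour[i]][tour[(i + 1) % n]] for i in range(n))
    return a, b

def pareto_front(points):
    """Keep only non-dominated objective vectors (minimization in both objectives)."""
    front = []
    for p in points:
        dominated = any(q[0] <= p[0] and q[1] <= p[1] and q != p for q in points)
        if not dominated:
            front.append(p)
    return front

# Toy bi-objective instance: two cities, symmetric costs per objective.
dist_a = [[0, 1], [1, 0]]
dist_b = [[0, 5], [5, 0]]
print(tour_lengths([0, 1], dist_a, dist_b))          # -> (2, 10)
print(pareto_front([(1, 3), (2, 2), (3, 1), (3, 3)]))  # (3, 3) is dominated by (2, 2)
```

In the paper's setting the candidate tours would come from the trained network's forward pass rather than from an explicit search, which is why the abstract emphasizes the low inference cost of obtaining the Pareto front.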
