Multiobjective Optimization for Traveling Salesman Problem: A Deep Reinforcement Learning Algorithm via Transfer Learning

IEEE Transactions on Artificial Intelligence, vol. 6, no. 4, pp. 896-908. Pub Date: 2024-11-15. DOI: 10.1109/TAI.2024.3499946. URL: https://ieeexplore.ieee.org/document/10754652/

Le-yang Gao; Rui Wang; Zhao-hong Jia; Chuang Liu


Citation count: 0

Abstract

A wide range of real-world applications can be modelled as the multiobjective traveling salesman problem (MOTSP), a typical combinatorial optimization problem. Meta-heuristics can be used to address the MOTSP. However, because they iteratively search a large solution space, they often require significant computation time. Recently, deep reinforcement learning (DRL) algorithms have been employed to generate near-optimal solutions to the single-objective traveling salesman problem, as well as to MOTSPs. This study proposes a multiobjective optimization algorithm based on DRL, called the multiobjective pointer network (MOPN), in which the input structure of the pointer network is redesigned so that it can be applied to the MOTSP. Furthermore, a training strategy that uses a representative model and transfer learning is introduced to enhance the performance of MOPN. The proposed MOPN is insensitive to problem scale, meaning that a trained MOPN can address MOTSPs of different scales. Compared to meta-heuristics, MOPN needs much less time to obtain the Pareto front, because it requires only forward propagation. To verify the performance of the model, extensive experiments are conducted on three different MOTSPs to compare MOPN with two state-of-the-art DRL models and two multiobjective meta-heuristics. Experimental results demonstrate that the proposed MOPN obtains the best solution with the least training time among all the compared DRL methods.
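To make the terms in the abstract concrete, the following minimal Python sketch shows how a candidate tour can be scored on several objectives and how a Pareto front can be extracted from a set of scored tours. It is not the MOPN method. It assumes a Euclidean MOTSP formulation in which each city has one set of 2D coordinates per objective, which the abstract does not specify, and the names `tour_costs`, `dominates`, and `pareto_front` are hypothetical helpers introduced only for illustration.

```python
import math
import random


def tour_costs(tour, coords_per_objective):
    """Return the closed-tour length under each objective.

    Assumption: objective k is the Euclidean tour length measured
    with the k-th coordinate set (one (x, y) pair per city).
    """
    costs = []
    for coords in coords_per_objective:
        total = 0.0
        for i in range(len(tour)):
            a = coords[tour[i]]
            b = coords[tour[(i + 1) % len(tour)]]  # wrap around to the start city
            total += math.dist(a, b)
        costs.append(total)
    return tuple(costs)


def dominates(u, v):
    """True if cost vector u is no worse than v everywhere and better somewhere (minimization)."""
    return all(x <= y for x, y in zip(u, v)) and any(x < y for x, y in zip(u, v))


def pareto_front(points):
    """Keep only the cost vectors that no other vector dominates."""
    return [p for p in points if not any(dominates(q, p) for q in points)]


if __name__ == "__main__":
    rng = random.Random(0)
    n_cities = 8
    # Two objectives, each with its own random coordinates for every city.
    coords = [[(rng.random(), rng.random()) for _ in range(n_cities)] for _ in range(2)]
    # Random permutations stand in for tours that a solver would propose.
    tours = [rng.sample(range(n_cities), n_cities) for _ in range(50)]
    front = pareto_front([tour_costs(t, coords) for t in tours])
    print(len(front), "nondominated tours out of", len(tours))
```

In this sketch the random permutations are placeholders for the tours produced by a solver. A meta-heuristic would refine such a population over many iterations, whereas a trained model in the spirit of the abstract would produce candidate tours by forward propagation alone.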


Source journal

IEEE Transactions on Artificial Intelligence

CiteScore

7.70

Self-citation rate

0.00%

Publication count