Reinforcement Learning for Solving Colored Traveling Salesman Problems: An Entropy-Insensitive Attention Approach

IEEE Transactions on Artificial Intelligence. Pub Date: 2024-09-19. DOI: 10.1109/TAI.2024.3461630

Tianyu Zhu; Xinli Shi; Xiangping Xu; Jinde Cao


Abstract

Neural network models for solving combinatorial optimization problems (COPs) have gained significant attention in recent years and have shown encouraging results on problems such as the traveling salesman problem (TSP). The multiple TSP (MTSP), a special class of COP, has also attracted the interest of researchers. The colored TSP (CTSP) is a variation of the MTSP that uses colors to distinguish which cities are accessible to which salesmen. This article proposes a gated entropy-insensitive attention model (GEIAM) to solve the CTSP. Specifically, the original problem is first modeled as a sequence and preprocessed by the model's problem feature extraction network, and is then solved by an autoregressive solution constructor. The policy (the parameters of the neural network model) is trained via reinforcement learning (RL). The proposed approach is compared with several commercial solvers as well as heuristics and demonstrates superior solving speed with comparable solution quality.
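To make the problem setting concrete, the following is a minimal sketch of a CTSP instance and of autoregressive route construction under a color-based feasibility mask. It is not the GEIAM model: a nearest-neighbor rule stands in for the learned policy, and the names `greedy_ctsp`, `tour_length`, `allowed`, and the shared depot at index 0 are assumptions made for illustration only.

```python
import math


def tour_length(route, coords):
    """Length of a closed route that starts and ends at route[0]."""
    return sum(
        math.dist(coords[a], coords[b])
        for a, b in zip(route, route[1:] + route[:1])
    )


def greedy_ctsp(coords, allowed, n_salesmen, depot=0):
    """Build one route per salesman, one city at a time.

    allowed[c] is the set of salesmen permitted to visit city c (its "color").
    Assumption: a nearest-neighbor choice replaces the learned policy.
    """
    unvisited = set(range(len(coords))) - {depot}
    routes = []
    for s in range(n_salesmen):
        route = [depot]
        while True:
            # Feasibility mask: unvisited cities whose color admits salesman s.
            candidates = [c for c in unvisited if s in allowed[c]]
            if not candidates:
                break
            # Autoregressive step: the next city depends on the partial route.
            nxt = min(
                candidates,
                key=lambda c: math.dist(coords[route[-1]], coords[c]),
            )
            route.append(nxt)
            unvisited.remove(nxt)
        routes.append(route)
    return routes


# Hypothetical instance: city 2 is shared, the others are exclusive.
coords = [(0, 0), (1, 0), (2, 0), (0, 1), (0, 2)]
allowed = {1: {0}, 2: {0, 1}, 3: {1}, 4: {1}}
routes = greedy_ctsp(coords, allowed, n_salesmen=2)
total = sum(tour_length(r, coords) for r in routes)
```

In a learned approach such as the one the abstract describes, the `min(...)` step would be replaced by sampling from a policy network's output distribution over the masked candidates, and the negative total length would serve as the RL reward.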

