Automated reinforcement learning for sequential ordering problem using hyperparameter optimization and metalearning

Autonomous Intelligent Systems Pub Date: 2025-07-29 DOI: 10.1007/s43684-025-00103-2

André Luiz Carvalho Ottoni

Journal Article, Autonomous Intelligent Systems, volume 5, issue 1. Article: https://link.springer.com/article/10.1007/s43684-025-00103-2 PDF: https://link.springer.com/content/pdf/10.1007/s43684-025-00103-2.pdf

Citations: 0

Abstract

AutoML systems seek to help Artificial Intelligence users find the best configurations for machine learning models. Along the same line, Automated Reinforcement Learning (AutoRL) has recently become increasingly relevant, given the growing number of applications of reinforcement learning algorithms. However, the literature still lacks AutoRL systems designed specifically for combinatorial optimization, especially for the Sequential Ordering Problem (SOP). This paper therefore presents a new AutoRL approach for SOP. Two new methods based on hyperparameter optimization and metalearning are proposed: AutoRL-SOP and AutoRL-SOP-MtL. The proposed AutoRL techniques enable the combined tuning of three SARSA hyperparameters: the ϵ of the ϵ-greedy policy, the learning rate, and the discount factor. Furthermore, the new metalearning approach enables the transfer of hyperparameters between two combinatorial optimization domains: TSP (source) and SOP (target). The results show that applying metalearning reduces the computational cost of hyperparameter optimization. In addition, the proposed AutoRL methods achieved the best solutions in 23 out of 28 simulated TSPLIB instances when compared with recent studies in the literature.
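To make the three tuned quantities concrete, the following is a minimal sketch of tabular SARSA on a small ordering problem with precedence constraints. It is not the AutoRL-SOP or AutoRL-SOP-MtL method from the paper, whose details the abstract does not give. The cost matrix, the precedence pairs, the open-path formulation starting at node 0, the reward of negative edge cost, and the names `run_sarsa` and `grid_search` are all assumptions made for illustration. A plain grid search stands in for the hyperparameter optimizer, and the final step only imitates the transfer idea by tuning on an unconstrained, TSP-like source and reusing that configuration on the constrained SOP target.

```python
import itertools
import random

# Hypothetical 5-node instance (illustrative costs, not taken from TSPLIB).
COST = [
    [0, 4, 7, 3, 8],
    [4, 0, 2, 6, 5],
    [7, 2, 0, 4, 3],
    [3, 6, 4, 0, 2],
    [8, 5, 3, 2, 0],
]
N = len(COST)
# (a, b) means node a must be visited before node b (assumed constraints).
SOP_PRECEDENCE = ((1, 3), (2, 4))


def feasible_actions(visited, precedence):
    """Unvisited nodes whose required predecessors are all visited."""
    return [
        j for j in range(N)
        if j not in visited
        and all(a in visited for a, b in precedence if b == j)
    ]


def epsilon_greedy(q, state, actions, epsilon, rng):
    """Explore with probability epsilon, otherwise pick the highest Q-value."""
    if rng.random() < epsilon:
        return rng.choice(actions)
    return max(actions, key=lambda a: q[state][a])


def run_sarsa(alpha, gamma, epsilon, precedence=(), episodes=300, seed=0):
    """Tabular SARSA: state = current node, action = next node, reward = -cost."""
    rng = random.Random(seed)
    q = [[0.0] * N for _ in range(N)]
    best_cost, best_tour = float("inf"), None
    for _ in range(episodes):
        state, visited, tour, total = 0, {0}, [0], 0
        action = epsilon_greedy(q, state, feasible_actions(visited, precedence), epsilon, rng)
        while True:
            reward = -COST[state][action]
            total += COST[state][action]
            visited.add(action)
            tour.append(action)
            next_actions = feasible_actions(visited, precedence)
            if not next_actions:  # every node visited: terminal update
                q[state][action] += alpha * (reward - q[state][action])
                break
            next_action = epsilon_greedy(q, action, next_actions, epsilon, rng)
            # On-policy target: uses the action actually chosen for the next step.
            target = reward + gamma * q[action][next_action]
            q[state][action] += alpha * (target - q[state][action])
            state, action = action, next_action
        if total < best_cost:
            best_cost, best_tour = total, tour
    return best_cost, best_tour


def grid_search(grid, precedence=()):
    """Evaluate every (alpha, gamma, epsilon) combination; return the best one."""
    scores = {
        cfg: run_sarsa(*cfg, precedence=precedence)[0]
        for cfg in itertools.product(grid["alpha"], grid["gamma"], grid["epsilon"])
    }
    best = min(scores, key=scores.get)
    return best, scores[best]


if __name__ == "__main__":
    grid = {"alpha": [0.1, 0.5, 0.9], "gamma": [0.1, 0.5, 0.9], "epsilon": [0.01, 0.1]}
    # Tune on the unconstrained (TSP-like) source, then reuse the result on the SOP target.
    source_cfg, _ = grid_search(grid, precedence=())
    cost, tour = run_sarsa(*source_cfg, precedence=SOP_PRECEDENCE)
    print(source_cfg, cost, tour)
```

In this sketch, reusing the source configuration costs a single SARSA run on the target, whereas tuning on the target from scratch would cost one run per grid point. That is only a toy illustration of why transferring hyperparameters can lower tuning cost, not a reproduction of the paper's results.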

Source journal

Autonomous Intelligent Systems

CiteScore

3.90

Self-citation rate

0.00%

Number of articles published