Annealing linear scalarized based multi-objective multi-armed bandit algorithm

2015 IEEE Congress on Evolutionary Computation (CEC) Pub Date : 2015-05-25 DOI:10.1109/CEC.2015.7257097

Saba Q. Yahyaa, Mădălina M. Drugan, B. Manderick


Citations: 1

Abstract


Annealing linear scalarized based multi-objective multi-armed bandit algorithm

A stochastic multi-objective multi-armed bandit problem is a particular type of multi-objective (MO) optimization problem in which the goal is to find the optimal arms and play them fairly. To solve it, we propose the annealing linear scalarized algorithm, which transforms the MO optimization problem into a single-objective one by using a linear scalarization function, and which finds and fairly plays the optimal arms by using a decaying parameter ε_t. We empirically compare the linear scalarized-UCB1 algorithm with the annealing linear scalarized algorithm on a test suite of multi-objective multi-armed bandit problems with independent Bernoulli distributions, using different approaches to define the weight sets: the standard approach, the adaptive approach and the genetic approach. We conclude that the performance of both the annealing linear scalarized and the linear scalarized-UCB1 algorithms depends on the weight approach used.
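The abstract does not spell out the algorithm, so the following is only a minimal sketch of the two ingredients it names: a linear scalarization function (a weighted sum of each arm's per-objective mean estimates) and a decaying parameter that narrows the set of arms still being played. Everything beyond those two ideas is an assumption made for illustration: the function names, the geometric decay schedule (`eps0`, `decay`), the rule "play uniformly among arms whose scalarized estimate is within the current epsilon of the best", and the use of a single weight vector instead of a weight set.

```python
import random


def linear_scalarize(weights, mean_vector):
    """Weighted sum of one arm's per-objective mean estimates."""
    return sum(w * m for w, m in zip(weights, mean_vector))


def annealing_linear_scalarized(true_means, weights, horizon,
                                eps0=1.0, decay=0.99, seed=0):
    """Illustrative sketch, not the paper's exact procedure.

    true_means[a][d] is the Bernoulli success probability of arm a
    in objective d. Returns the number of pulls per arm.
    """
    rng = random.Random(seed)
    n_arms = len(true_means)
    n_obj = len(weights)
    pulls = [0] * n_arms
    sums = [[0.0] * n_obj for _ in range(n_arms)]

    def pull(arm):
        # Independent Bernoulli reward in each objective.
        for d in range(n_obj):
            if rng.random() < true_means[arm][d]:
                sums[arm][d] += 1.0
        pulls[arm] += 1

    # Initialization: play every arm once.
    for arm in range(n_arms):
        pull(arm)

    eps = eps0  # assumed schedule: eps_t = eps0 * decay**t
    for _ in range(n_arms, horizon):
        scores = [
            linear_scalarize(weights, [s / pulls[a] for s in sums[a]])
            for a in range(n_arms)
        ]
        best = max(scores)
        # Arms within eps of the best scalarized estimate stay in play
        # and are chosen uniformly, which treats them fairly.
        candidates = [a for a in range(n_arms) if scores[a] >= best - eps]
        pull(rng.choice(candidates))
        eps *= decay
    return pulls


if __name__ == "__main__":
    means = [[0.9, 0.9], [0.1, 0.1], [0.5, 0.5]]
    print(annealing_linear_scalarized(means, [0.5, 0.5], horizon=2000))
```

With a large initial epsilon every arm is a candidate, so early play is uniform exploration; as epsilon shrinks, only arms whose scalarized estimate is close to the best remain. Covering several optimal arms would require repeating this over a set of weight vectors, which is where the weight-set approaches compared in the abstract come in.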
