Overcoming Deceptive Rewards with Quality-Diversity

Proceedings of the Companion Conference on Genetic and Evolutionary Computation. Pub Date: 2023-07-15. DOI: 10.1145/3583133.3590741

A. Feiden, J. Garcke


Citations: 0

Abstract


Overcoming Deceptive Rewards with Quality-Diversity

Quality-Diversity offers powerful ideas to create diverse, high-performing populations. Here, we investigate the capabilities these ideas hold to solve exploration-hard single-objective problems, in addition to creating diverse high-performing populations. We find that MAP-Elites is well suited to overcome deceptive reward structures, while an Elites-type approach with an unstructured, distance-based container and extinction events can even outperform it. Furthermore, we analyse how the QD score, the standard evaluation of MAP-Elites-type algorithms, is not well suited to predict the success of a configuration in solving a maze. This shows that the exploration capacity is an entirely different dimension in which QD algorithms can be utilized, evaluated, and improved upon. This dimension does not seem to be covered, implicitly or explicitly, by current advances in the field.
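The abstract names MAP-Elites and the QD score without defining them. As a rough illustration only, the following is a minimal sketch of the commonly described grid-based MAP-Elites loop, with the QD score taken as the sum of elite fitnesses in the archive. The toy task, the function names (`map_elites`, `qd_score`, `toy_fitness`, `toy_descriptor`), and all parameter values are assumptions made for this example. It is not the authors' implementation, it does not use their maze domain, and it does not model the distance-based container or extinction events mentioned above.

```python
import random


def map_elites(fitness, descriptor, dim=2, bins=10, iterations=2000,
               sigma=0.1, seed=0):
    """Grid-based MAP-Elites sketch for genomes in the unit hypercube."""
    rng = random.Random(seed)
    archive = {}  # grid cell (tuple of ints) -> (fitness, genome)

    def try_insert(genome):
        f = fitness(genome)
        # Discretise the behaviour descriptor (assumed in [0, 1]) into a cell.
        cell = tuple(min(int(d * bins), bins - 1) for d in descriptor(genome))
        # Keep the genome only if the cell is empty or it beats the elite.
        if cell not in archive or f > archive[cell][0]:
            archive[cell] = (f, genome)

    # Bootstrap the archive with random genomes.
    for _ in range(50):
        try_insert([rng.random() for _ in range(dim)])

    # Main loop: pick a random elite, mutate it, try to insert the child.
    for _ in range(iterations):
        _, parent = rng.choice(list(archive.values()))
        child = [min(1.0, max(0.0, g + rng.gauss(0.0, sigma))) for g in parent]
        try_insert(child)
    return archive


def qd_score(archive):
    """Sum of elite fitnesses (assumes non-negative fitness values)."""
    return sum(f for f, _ in archive.values())


# Hypothetical toy task: reach GOAL in the unit square (no walls, so this
# particular task is not deceptive; it only exercises the algorithm).
GOAL = (0.9, 0.9)


def toy_fitness(genome):
    dist = ((genome[0] - GOAL[0]) ** 2 + (genome[1] - GOAL[1]) ** 2) ** 0.5
    return 1.0 - dist / 2 ** 0.5  # stays within [0, 1] on the unit square


def toy_descriptor(genome):
    return genome  # the behaviour descriptor is simply the 2-D position


archive = map_elites(toy_fitness, toy_descriptor)
best_fitness = max(f for f, _ in archive.values())
print(len(archive), round(qd_score(archive), 2), round(best_fitness, 3))
```

In this sketch the QD score aggregates fitness over all filled cells, whereas `best_fitness` only tracks the single best solution. The two numbers measure different things, which is the kind of distinction the abstract draws between QD score and success on an exploration-hard task.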

