Using Decomposed Error for Reproducing Implicit Understanding of Algorithms

IF 3.4; Zone 2, Computer Science; JCR Q2, COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE

Evolutionary Computation Pub Date : 2024-03-01 DOI:10.1162/evco_a_00321

Caitlin A. Owen;Grant Dick;Peter A. Whigham

Evolutionary Computation, 32(1), pp. 49-68. Journal Article.

Citation count: 0

Abstract

Reproducibility is important for having confidence in evolutionary machine learning algorithms. Although the focus of reproducibility is usually to recreate an aggregate prediction error score using fixed random seeds, this is not sufficient. Firstly, multiple runs of an algorithm, without a fixed random seed, should ideally return statistically equivalent results. Secondly, it should be confirmed whether the expected behaviour of an algorithm matches its actual behaviour, in terms of how an algorithm targets a reduction in prediction error. Confirming the behaviour of an algorithm is not possible when using a total error aggregate score. Using an error decomposition framework as a methodology for improving the reproducibility of results in evolutionary computation addresses both of these factors. By estimating decomposed error using multiple runs of an algorithm and multiple training sets, the framework provides a greater degree of certainty about the prediction error. Also, decomposing error into bias, variance due to the algorithm (internal variance), and variance due to the training data (external variance) more fully characterises evolutionary algorithms. This allows the behaviour of an algorithm to be confirmed. Applying the framework to a number of evolutionary algorithms shows that their expected behaviour can be different to their actual behaviour. Identifying a behaviour mismatch is important in terms of understanding how to further refine an algorithm as well as how to effectively apply an algorithm to a problem.
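The abstract does not give the estimators used in the paper, so the following is only a minimal sketch of one way to split squared prediction error into the three components it names, using the law of total variance. It assumes squared-error loss, noise-free test targets, and an equal number of runs per training set. The names `decompose_error`, `toy_stochastic_learner`, and `run_experiment` are hypothetical, and the toy learner (a line fitted to a random half of the training data) is merely a stand-in for a stochastic evolutionary algorithm.

```python
import numpy as np


def decompose_error(preds, y_true):
    """Split mean squared error into bias^2, internal and external variance.

    preds  : array of shape (n_train_sets, n_runs, n_test_points)
    y_true : array of shape (n_test_points,), assumed noise-free
    """
    preds = np.asarray(preds, dtype=float)
    y_true = np.asarray(y_true, dtype=float)

    mean_per_set = preds.mean(axis=1)        # average over runs, per training set
    grand_mean = mean_per_set.mean(axis=0)   # average over training sets and runs

    bias_sq = (grand_mean - y_true) ** 2
    # Internal variance: spread across runs on the same training set,
    # averaged over training sets (variance due to the algorithm).
    internal = preds.var(axis=1).mean(axis=0)
    # External variance: spread of the per-set mean prediction across
    # training sets (variance due to the training data).
    external = mean_per_set.var(axis=0)
    total = ((preds - y_true) ** 2).mean(axis=(0, 1))

    return {
        "bias_sq": float(bias_sq.mean()),
        "internal_var": float(internal.mean()),
        "external_var": float(external.mean()),
        "total": float(total.mean()),
    }


def toy_stochastic_learner(x_train, y_train, x_test, rng):
    # Hypothetical stand-in for an evolutionary algorithm: the random
    # subsample makes repeated runs on the same data give different models.
    idx = rng.choice(len(x_train), size=len(x_train) // 2, replace=False)
    coeffs = np.polyfit(x_train[idx], y_train[idx], deg=1)
    return np.polyval(coeffs, x_test)


def run_experiment(n_sets=20, n_runs=10, n_train=40, seed=0):
    rng = np.random.default_rng(seed)
    x_test = np.linspace(-1.0, 1.0, 50)
    y_test = np.sin(np.pi * x_test)          # synthetic target, for illustration only
    preds = np.empty((n_sets, n_runs, x_test.size))
    for d in range(n_sets):                  # multiple training sets
        x_train = rng.uniform(-1.0, 1.0, n_train)
        y_train = np.sin(np.pi * x_train) + rng.normal(0.0, 0.1, n_train)
        for r in range(n_runs):              # multiple runs per training set
            preds[d, r] = toy_stochastic_learner(x_train, y_train, x_test, rng)
    return decompose_error(preds, y_test)


result = run_experiment()
for name, value in result.items():
    print(f"{name}: {value:.4f}")
```

Under these assumptions the three components sum to the total squared error, so two algorithms with similar aggregate scores can still be told apart by which component dominates.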


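Source journal

Evolutionary Computation (Engineering and Technology, Computer Science: Theory and Methods)

CiteScore

6.40

Self-citation rate

1.50%

Articles published

Review time

3 months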

Journal description: Evolutionary Computation is a leading journal in its field. It provides an international forum for facilitating and enhancing the exchange of information among researchers involved in both the theoretical and practical aspects of computational systems drawing their inspiration from nature, with particular emphasis on evolutionary models of computation such as genetic algorithms, evolutionary strategies, classifier systems, evolutionary programming, and genetic programming. It welcomes articles from related fields such as swarm intelligence (e.g. Ant Colony Optimization and Particle Swarm Optimization), and other nature-inspired computation paradigms (e.g. Artificial Immune Systems). As well as publishing articles describing theoretical and/or experimental work, the journal also welcomes application-focused papers describing breakthrough results in an application domain or methodological papers where the specificities of the real-world problem led to significant algorithmic improvements that could possibly be generalized to other areas.