Generalisation and domain adaptation in GP with gradient descent for symbolic regression

Qi Chen, Bing Xue, Mengjie Zhang
DOI: 10.1109/CEC.2015.7257017
Published in: 2015 IEEE Congress on Evolutionary Computation (CEC), 25 May 2015
Citations: 38

Abstract

Genetic programming (GP) has been widely applied to symbolic regression and has achieved considerable success. Gradient descent has also been used within GP as a local search complementing the genetic beam search, further improving symbolic regression performance. However, most existing GP approaches with gradient descent (GPGD) for symbolic regression have been tested only on "conventional" problems, such as benchmark function approximation and practical engineering problems with a single (training) data set; their effectiveness on unseen data in the same domain, or in different domains, has not been fully investigated. This paper designs a series of experiments to investigate the effectiveness and efficiency of GPGD under various settings on a set of symbolic regression problems, applied to unseen data in the same domain and adapted to other domains. The results suggest that the existing GPGD method, which applies gradient descent to all evolved program trees three times at every generation, can perform very well on the training set itself, but generalises poorly to unseen data in the same domain and cannot be adapted to unseen data in an extended domain. Applying gradient descent only to the best program in the final generation also improves on standard GP and generalises well on unseen data for some tasks in the same domain, but performs poorly on unseen data in an extended domain. Applying gradient descent to the top 20% of programs in the population generalises reasonably well on unseen data, not only in the same domain but also in an extended domain.