Learning feature spaces for regression with genetic programming.

IF 0.9; Zone 3, Computer Science; Q3, COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE

Genetic Programming and Evolvable Machines. Pub Date: 2020-09-01. Epub Date: 2020-03-11. DOI: 10.1007/s10710-020-09383-4

William La Cava, Jason H Moore

Genetic Programming and Evolvable Machines, volume 21, issue 3, pages 433-467. Journal Article. Full text: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7748157/pdf/nihms-1575378.pdf

Citations: 0

Abstract

Genetic programming has found recent success as a tool for learning sets of features for regression and classification. Multidimensional genetic programming is a useful variant of genetic programming for this task because it represents candidate solutions as sets of programs. These sets of programs expose additional information that can be exploited for building block identification. In this work, we discuss this architecture and others in terms of their propensity for allowing heuristic search to utilize information during the evolutionary process. We investigate methods for biasing the components of programs that are promoted in order to guide search towards useful and complementary feature spaces. We study two main approaches: 1) the introduction of new objectives and 2) the use of specialized semantic variation operators. We find that a semantic crossover operator based on stagewise regression leads to significant improvements on a set of regression problems. The inclusion of semantic crossover produces state-of-the-art results in a large benchmark study of open-source regression problems in comparison to several state-of-the-art machine learning approaches and other genetic programming frameworks. Finally, we look at the collinearity and complexity of the data representations produced by different methods, in order to assess whether relevant, concise, and independent factors of variation can be produced in application.
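The abstract names a semantic crossover operator based on stagewise regression but does not spell out its steps. The following is a minimal sketch of one plausible reading, not the authors' implementation: each parent is represented only by its semantics (a matrix whose columns are the outputs of its programs on the training data), the two parents' features are pooled, and a greedy forward stagewise pass keeps the columns most correlated with the current residual. The function names `stagewise_select` and `stagewise_crossover`, the choice to keep as many features as the first parent has, and the synthetic data are all assumptions made for illustration.

```python
import numpy as np


def stagewise_select(features, y, k):
    """Greedily pick up to k columns of `features` by forward stagewise regression.

    Hypothetical helper: at each step the unused column most correlated with
    the residual is chosen, and its fitted contribution is subtracted.
    """
    # Center and scale columns to unit norm so scores are comparable.
    centered = features - features.mean(axis=0)
    norms = np.linalg.norm(centered, axis=0)
    norms[norms == 0] = 1.0  # constant columns become all-zero, never preferred
    z = centered / norms

    residual = y - y.mean()
    available = np.ones(z.shape[1], dtype=bool)
    chosen = []
    for _ in range(min(k, z.shape[1])):
        # Absolute inner product with the residual; used columns are masked out.
        scores = np.where(available, np.abs(z.T @ residual), -np.inf)
        best = int(np.argmax(scores))
        chosen.append(best)
        available[best] = False
        # For a unit-norm column the least-squares coefficient is the inner product.
        coef = z[:, best] @ residual
        residual = residual - coef * z[:, best]
    return chosen


def stagewise_crossover(parent_a, parent_b, y):
    """Build a child feature matrix from the pooled features of two parents.

    Assumption: the child keeps as many features as parent_a has.
    """
    pool = np.hstack([parent_a, parent_b])
    keep = stagewise_select(pool, y, parent_a.shape[1])
    return pool[:, keep], keep


# Synthetic illustration: each parent holds one useful and one noise feature.
rng = np.random.default_rng(0)
x1, x2 = rng.normal(size=200), rng.normal(size=200)
y = 2.0 * x1 - 3.0 * x2
parent_a = np.column_stack([x1, rng.normal(size=200)])
parent_b = np.column_stack([rng.normal(size=200), x2])
child, keep = stagewise_crossover(parent_a, parent_b, y)
print("kept pooled column indices:", sorted(keep))
```

In a full genetic programming system the kept indices would map back to the programs that produced those columns, so the child would inherit the programs themselves rather than only their outputs. Working on the semantics keeps the sketch independent of any particular program representation.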


Source journal

Genetic Programming and Evolvable Machines (Engineering and Technology: Computer Science, Theory and Methods)

CiteScore

5.90

Self-citation rate

3.80%

Articles published

Review time

6 months

About the journal: A unique source reporting on methods for artificial evolution of programs and machines... Reports innovative and significant progress in automatic evolution of software and hardware. Features both theoretical and application papers. Covers hardware implementations, artificial life, molecular computing and emergent computation techniques. Examines such related topics as evolutionary algorithms with variable-size genomes, alternate methods of program induction, approaches to engineering systems development based on embryology, morphogenesis or other techniques inspired by adaptive natural systems.