Effects of reducing redundant parameters in parameter optimization for symbolic regression using genetic programming

IF 0.6 4区 数学 Q4 COMPUTER SCIENCE, THEORY & METHODS
Gabriel Kronberger , Fabrício Olivetti de França
{"title":"Effects of reducing redundant parameters in parameter optimization for symbolic regression using genetic programming","authors":"Gabriel Kronberger ,&nbsp;Fabrício Olivetti de França","doi":"10.1016/j.jsc.2024.102413","DOIUrl":null,"url":null,"abstract":"<div><div>Gradient-based local optimization has been shown to improve results of genetic programming (GP) for symbolic regression (SR) – a machine learning method for symbolic equation learning. Correspondingly, several state-of-the-art GP implementations use iterative nonlinear least squares (NLS) algorithms for local optimization of parameters. An issue that has however mostly been ignored in SR and GP literature is overparameterization of SR expressions, and as a consequence, bad conditioning of NLS optimization problem. The aim of this study is to analyze the effects of overparameterization on the NLS results and convergence speed, whereby we use Operon as an example GP/SR implementation. First, we demonstrate that numeric rank approximation can be used to detect overparameterization using a set of six selected benchmark problems. In the second part, we analyze whether the NLS results or convergence speed can be improved by simplifying expressions to remove redundant parameters with equality saturation. This analysis is done with the much larger Feynman symbolic regression benchmark set after collecting all expressions visited by GP, as the simplification procedure is not fast enough to use it within GP fitness evaluation. We observe that Operon frequently visits overparameterized solutions but the number of redundant parameters is small on average. We analyzed the Pareto-optimal expressions of the first and last generation of GP, and found that for 70% to 80% of the simplified expressions, the success rate of reaching the optimum was better or equal than for the overparameterized form. The effect was smaller for the number of NLS iterations until convergence, where we found fewer or equal iterations for 51% to 63% of the expressions after simplification.</div></div>","PeriodicalId":50031,"journal":{"name":"Journal of Symbolic Computation","volume":"129 ","pages":"Article 102413"},"PeriodicalIF":0.6000,"publicationDate":"2024-12-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Journal of Symbolic Computation","FirstCategoryId":"100","ListUrlMain":"https://www.sciencedirect.com/science/article/pii/S0747717124001172","RegionNum":4,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q4","JCRName":"COMPUTER SCIENCE, THEORY & METHODS","Score":null,"Total":0}
引用次数: 0

Abstract

Gradient-based local optimization has been shown to improve results of genetic programming (GP) for symbolic regression (SR) – a machine learning method for symbolic equation learning. Correspondingly, several state-of-the-art GP implementations use iterative nonlinear least squares (NLS) algorithms for local optimization of parameters. An issue that has however mostly been ignored in SR and GP literature is overparameterization of SR expressions, and as a consequence, bad conditioning of NLS optimization problem. The aim of this study is to analyze the effects of overparameterization on the NLS results and convergence speed, whereby we use Operon as an example GP/SR implementation. First, we demonstrate that numeric rank approximation can be used to detect overparameterization using a set of six selected benchmark problems. In the second part, we analyze whether the NLS results or convergence speed can be improved by simplifying expressions to remove redundant parameters with equality saturation. This analysis is done with the much larger Feynman symbolic regression benchmark set after collecting all expressions visited by GP, as the simplification procedure is not fast enough to use it within GP fitness evaluation. We observe that Operon frequently visits overparameterized solutions but the number of redundant parameters is small on average. We analyzed the Pareto-optimal expressions of the first and last generation of GP, and found that for 70% to 80% of the simplified expressions, the success rate of reaching the optimum was better or equal than for the overparameterized form. The effect was smaller for the number of NLS iterations until convergence, where we found fewer or equal iterations for 51% to 63% of the expressions after simplification.
求助全文
约1分钟内获得全文 求助全文
来源期刊
Journal of Symbolic Computation
Journal of Symbolic Computation 工程技术-计算机:理论方法
CiteScore
2.10
自引率
14.30%
发文量
75
审稿时长
142 days
期刊介绍: An international journal, the Journal of Symbolic Computation, founded by Bruno Buchberger in 1985, is directed to mathematicians and computer scientists who have a particular interest in symbolic computation. The journal provides a forum for research in the algorithmic treatment of all types of symbolic objects: objects in formal languages (terms, formulas, programs); algebraic objects (elements in basic number domains, polynomials, residue classes, etc.); and geometrical objects. It is the explicit goal of the journal to promote the integration of symbolic computation by establishing one common avenue of communication for researchers working in the different subareas. It is also important that the algorithmic achievements of these areas should be made available to the human problem-solver in integrated software systems for symbolic computation. To help this integration, the journal publishes invited tutorial surveys as well as Applications Letters and System Descriptions.
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
确定
请完成安全验证×
copy
已复制链接
快去分享给好友吧!
我知道了
右上角分享
点击右上角分享
0
联系我们:info@booksci.cn Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。 Copyright © 2023 布克学术 All rights reserved.
京ICP备2023020795号-1
ghs 京公网安备 11010802042870号
Book学术文献互助
Book学术文献互助群
群 号:481959085
Book学术官方微信