Ryan Boldi, Martin Briesch, Dominik Sobania, Alexander Lalejini, Thomas Helmuth, Franz Rothlauf, Charles Ofria, Lee Spector
{"title":"Informed Down-Sampled Lexicase Selection: Identifying Productive Training Cases for Efficient Problem Solving.","authors":"Ryan Boldi, Martin Briesch, Dominik Sobania, Alexander Lalejini, Thomas Helmuth, Franz Rothlauf, Charles Ofria, Lee Spector","doi":"10.1162/evco_a_00346","DOIUrl":"10.1162/evco_a_00346","url":null,"abstract":"<p><p>Genetic Programming (GP) often uses large training sets and requires all individuals to be evaluated on all training cases during selection. Random down-sampled lexicase selection evaluates individuals on only a random subset of the training cases, allowing for more individuals to be explored with the same number of program executions. However, sampling randomly can exclude important cases from the down-sample for a number of generations, while cases that measure the same behavior (synonymous cases) may be overused. In this work, we introduce Informed Down-Sampled Lexicase Selection. This method leverages population statistics to build down-samples that contain more distinct and therefore informative training cases. Through an empirical investigation across two different GP systems (PushGP and Grammar-Guided GP), we find that informed down-sampling significantly outperforms random down-sampling on a set of contemporary program synthesis benchmark problems. Through an analysis of the created down-samples, we find that important training cases are included in the down-sample consistently across independent evolutionary runs and systems. We hypothesize that this improvement can be attributed to the ability of Informed Down-Sampled Lexicase Selection to maintain more specialist individuals over the course of evolution, while still benefiting from reduced per-evaluation costs.</p>","PeriodicalId":50470,"journal":{"name":"Evolutionary Computation","volume":" ","pages":"307-337"},"PeriodicalIF":4.6,"publicationDate":"2024-12-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"139562620","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pablo Ramos Criado, D Barrios Rolanía, David de la Hoz, Daniel Manrique
{"title":"Estimation of Distribution Algorithm for Grammar-Guided Genetic Programming.","authors":"Pablo Ramos Criado, D Barrios Rolanía, David de la Hoz, Daniel Manrique","doi":"10.1162/evco_a_00345","DOIUrl":"10.1162/evco_a_00345","url":null,"abstract":"<p><p>Genetic variation operators in grammar-guided genetic programming are fundamental to guide the evolutionary process in search and optimization problems. However, they show some limitations, mainly derived from an unbalanced exploration and local-search trade-off. This paper presents an estimation of distribution algorithm for grammar-guided genetic programming to overcome this difficulty and thus increase the performance of the evolutionary algorithm. Our proposal employs an extended dynamic stochastic context-free grammar to encode and calculate the estimation of the distribution of the search space from some promising individuals in the population. Unlike traditional estimation of distribution algorithms, the proposed approach improves exploratory behavior by smoothing the estimated distribution model. Therefore, this algorithm is referred to as SEDA, smoothed estimation of distribution algorithm. Experiments have been conducted to compare overall performance using a typical genetic programming crossover operator, an incremental estimation of distribution algorithm, and the proposed approach after tuning their hyperparameters. These experiments involve challenging problems to test the local search and exploration features of the three evolutionary systems. The results show that grammar-guided genetic programming with SEDA achieves the most accurate solutions with an intermediate convergence speed.</p>","PeriodicalId":50470,"journal":{"name":"Evolutionary Computation","volume":" ","pages":"339-370"},"PeriodicalIF":4.6,"publicationDate":"2024-12-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"139565374","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Territorial Differential Meta-Evolution: An Algorithm for Seeking All the Desirable Optima of a Multivariable Function.","authors":"Richard Wehr, Scott R Saleska","doi":"10.1162/evco_a_00337","DOIUrl":"10.1162/evco_a_00337","url":null,"abstract":"<p><p>Territorial Differential Meta-Evolution (TDME) is an efficient, versatile, and reliable algorithm for seeking all the global or desirable local optima of a multivariable function. It employs a progressive niching mechanism to optimize even challenging, high-dimensional functions with multiple global optima and misleading local optima. This paper introduces TDME and uses standard and novel benchmark problems to quantify its advantages over HillVallEA, which is the best-performing algorithm on the standard benchmark suite that has been used by all major multimodal optimization competitions since 2013. TDME matches HillVallEA on that benchmark suite and categorically outperforms it on a more comprehensive suite that better reflects the potential diversity of optimization problems. TDME achieves that performance without any problem-specific parameter tuning.</p>","PeriodicalId":50470,"journal":{"name":"Evolutionary Computation","volume":" ","pages":"399-426"},"PeriodicalIF":4.6,"publicationDate":"2024-12-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"9726877","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Chao Li, Jun Sun, Li-Wei Li, Min Shan, Vasile Palade, Xiaojun Wu
{"title":"Virtual Position Guided Strategy for Particle Swarm Optimization Algorithms on Multimodal Problems.","authors":"Chao Li, Jun Sun, Li-Wei Li, Min Shan, Vasile Palade, Xiaojun Wu","doi":"10.1162/evco_a_00352","DOIUrl":"10.1162/evco_a_00352","url":null,"abstract":"<p><p>Premature convergence is a thorny problem for particle swarm optimization (PSO) algorithms, especially on multimodal problems, where maintaining swarm diversity is crucial. However, most enhancement strategies for PSO, including the existing diversity-guided strategies, have not fully addressed this issue. This paper proposes the virtual position guided (VPG) strategy for PSO algorithms. The VPG strategy calculates diversity values for two different populations and establishes a diversity baseline. It then dynamically guides the algorithm to conduct different search behaviors, through three phases-divergence, normal, and acceleration-in each iteration, based on the relationships among these diversity values and the baseline. Collectively, these phases orchestrate different schemes to balance exploration and exploitation, collaboratively steering the algorithm away from local optima and towards enhanced solution quality. The introduction of \"virtual position\" caters to the strategy's adaptability across various PSO algorithms, ensuring the generality and effectiveness of the proposed VPG strategy. With a single hyperparameter and a recommended usual setup, VPG is easy to implement. The experimental results demonstrate that the VPG strategy is superior to several canonical and the state-of-the-art strategies for diversity guidance, and is effective in improving the search performance of most PSO algorithms on multimodal problems of various dimensionalities.</p>","PeriodicalId":50470,"journal":{"name":"Evolutionary Computation","volume":" ","pages":"427-458"},"PeriodicalIF":4.6,"publicationDate":"2024-12-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141082836","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Arkadiy Dushatskiy, Marco Virgolin, Anton Bouter, Dirk Thierens, Peter A N Bosman
{"title":"Parameterless Gene-Pool Optimal Mixing Evolutionary Algorithms.","authors":"Arkadiy Dushatskiy, Marco Virgolin, Anton Bouter, Dirk Thierens, Peter A N Bosman","doi":"10.1162/evco_a_00338","DOIUrl":"10.1162/evco_a_00338","url":null,"abstract":"<p><p>When it comes to solving optimization problems with evolutionary algorithms (EAs) in a reliable and scalable manner, detecting and exploiting linkage information, that is, dependencies between variables, can be key. In this paper, we present the latest version of, and propose substantial enhancements to, the gene-pool optimal mixing evolutionary algorithm (GOMEA): an EA explicitly designed to estimate and exploit linkage information. We begin by performing a large-scale search over several GOMEA design choices to understand what matters most and obtain a generally best-performing version of the algorithm. Next, we introduce a novel version of GOMEA, called CGOMEA, where linkage-based variation is further improved by filtering solution mating based on conditional dependencies. We compare our latest version of GOMEA, the newly introduced CGOMEA, and another contending linkage-aware EA, DSMGA-II, in an extensive experimental evaluation, involving a benchmark set of nine black-box problems that can be solved efficiently only if their inherent dependency structure is unveiled and exploited. Finally, in an attempt to make EAs more usable and resilient to parameter choices, we investigate the performance of different automatic population management schemes for GOMEA and CGOMEA, de facto making the EAs parameterless. Our results show that GOMEA and CGOMEA significantly outperform the original GOMEA and DSMGA-II on most problems, setting a new state of the art for the field.</p>","PeriodicalId":50470,"journal":{"name":"Evolutionary Computation","volume":" ","pages":"371-397"},"PeriodicalIF":4.6,"publicationDate":"2024-12-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"10104132","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Genetic Programming-based Feature Selection for Symbolic Regression on Incomplete Data.","authors":"Baligh Al-Helali, Qi Chen, Bing Xue, Mengjie Zhang","doi":"10.1162/evco_a_00362","DOIUrl":"https://doi.org/10.1162/evco_a_00362","url":null,"abstract":"<p><p>High-dimensionality is one of the serious real-world data challenges in symbolic regression and it is more challenging if the data are incomplete. Genetic programming has been successfully utilised for high-dimensional tasks due to its natural feature selection ability, but it is not directly applicable to incomplete data. Commonly, it needs to impute the missing values first and then perform genetic programming on the imputed complete data. However, in the case of having many irrelevant features being incomplete, intuitively, it is not necessary to perform costly imputations on such features. For this purpose, this work proposes a genetic programming-based approach to select features directly from incomplete high-dimensional data to improve symbolic regression performance. We extend the concept of identity/neutral elements from mathematics into the function operators of genetic programming, thus they can handle the missing values in incomplete data. Experiments have been conducted on a number of data sets considering different missingness ratios in high-dimensional symbolic regression tasks. The results show that the proposed method leads to better symbolic regression results when compared with state-of-the-art methods that can select features directly from incomplete data. Further results show that our approach not only leads to better symbolic regression accuracy but also selects a smaller number of relevant features, and consequently improves both the effectiveness and the efficiency of the learning process.</p>","PeriodicalId":50470,"journal":{"name":"Evolutionary Computation","volume":" ","pages":"1-27"},"PeriodicalIF":4.6,"publicationDate":"2024-11-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142689431","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Tail Bounds on the Runtime of Categorical Compact Genetic Algorithm.","authors":"Ryoki Hamano, Kento Uchida, Shinichi Shirakawa, Daiki Morinaga, Youhei Akimoto","doi":"10.1162/evco_a_00361","DOIUrl":"https://doi.org/10.1162/evco_a_00361","url":null,"abstract":"<p><p>The majority of theoretical analyses of evolutionary algorithms in the discrete domain focus on binary optimization algorithms, even though black-box optimization on the categorical domain has a lot of practical applications. In this paper, we consider a probabilistic model-based algorithm using the family of categorical distributions as its underlying distribution and set the sample size as two. We term this specific algorithm the categorical compact genetic algorithm (ccGA). The ccGA can be considered as an extension of the compact genetic algorithm (cGA), which is an efficient binary optimization algorithm. We theoretically analyze the dependency of the number of possible categories K, the number of dimensions D, and the learning rate η on the runtime. We investigate the tail bound of the runtime on two typical linear functions on the categorical domain: categorical OneMax (COM) and KVAL. We derive that the runtimes on COM and KVAL are O(Dln(DK)/η) and Θ(DlnK/η) with high probability, respectively. Our analysis is a generalization for that of the cGA on the binary domain.</p>","PeriodicalId":50470,"journal":{"name":"Evolutionary Computation","volume":" ","pages":"1-52"},"PeriodicalIF":4.6,"publicationDate":"2024-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142367288","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Optimizing Monotone Chance-Constrained Submodular Functions Using Evolutionary Multi-Objective Algorithms.","authors":"Aneta Neumann, Frank Neumann","doi":"10.1162/evco_a_00360","DOIUrl":"https://doi.org/10.1162/evco_a_00360","url":null,"abstract":"<p><p>Many real-world optimization problems can be stated in terms of submodular functions. Furthermore, these real-world problems often involve uncertainties which may lead to the violation of given constraints. A lot of evolutionary multi-objective algorithms following the Pareto optimization approach have recently been analyzed and applied to submodular problems with different types of constraints. We present a first runtime analysis of evolutionary multi-objective algorithms based on Pareto optimization for chance-constrained submodular functions. Here the constraint involves stochastic components and the constraint can only be violated with a small probability of α. We investigate the classical GSEMO algorithm for two different bi-objective formulations using tail bounds to determine the feasibility of solutions. We show that the algorithm GSEMO obtains the same worst case performance guarantees for monotone submodular functions as recently analyzed greedy algorithms for the case of uniform IID weights and uniformly distributed weights with the same dispersion when using the appropriate bi-objective formulation. As part of our investigations, we also point out situations where the use of tail bounds in the first bi-objective formulation can prevent GSEMO from obtaining good solutions in the case of uniformly distributed weights with the same dispersion if the objective function is submodular but non-monotone due to a single element impacting monotonicity. Furthermore, we investigate the behavior of the evolutionary multi-objective algorithms GSEMO, NSGA-II and SPEA2 on different submodular chance-constrained network problems. Our experimental results show that the use of evolutionary multi-objective algorithms leads to significant performance improvements compared to state-of-the-art greedy algorithms for submodular optimization.</p>","PeriodicalId":50470,"journal":{"name":"Evolutionary Computation","volume":" ","pages":"1-35"},"PeriodicalIF":4.6,"publicationDate":"2024-09-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142331627","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Genetic Programming for Automatically Evolving Multiple Features to Classification.","authors":"Peng Wang, Bing Xue, Jing Liang, Mengjie Zhang","doi":"10.1162/evco_a_00359","DOIUrl":"https://doi.org/10.1162/evco_a_00359","url":null,"abstract":"<p><p>Performing classification on high-dimensional data poses a significant challenge due to the huge search space. Moreover, complex feature interactions introduce an additional obstacle. The problems can be addressed by using feature selection to select relevant features or feature construction to construct a small set of high-level features. However, performing feature selection or feature construction only might make the feature set suboptimal. To remedy this problem, this study investigates the use of genetic programming for simultaneous feature selection and feature construction in addressing different classification tasks. The proposed approach is tested on 16 datasets and compared with seven methods including both feature selection and feature constructions techniques. The results show that the obtained feature sets with the constructed and/or selected features can significantly increase the classification accuracy and reduce the dimensionality of the datasets. Further analysis reveals the complementarity of the obtained features leading to the promising classification performance of the proposed method.</p>","PeriodicalId":50470,"journal":{"name":"Evolutionary Computation","volume":" ","pages":"1-27"},"PeriodicalIF":4.6,"publicationDate":"2024-09-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142299919","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Giuseppe Paolo, Miranda Coninx, Alban Laflaquière, Stephane Doncieux
{"title":"Discovering and Exploiting Sparse Rewards in a Learned Behavior Space.","authors":"Giuseppe Paolo, Miranda Coninx, Alban Laflaquière, Stephane Doncieux","doi":"10.1162/evco_a_00343","DOIUrl":"10.1162/evco_a_00343","url":null,"abstract":"<p><p>Learning optimal policies in sparse rewards settings is difficult as the learning agent has little to no feedback on the quality of its actions. In these situations, a good strategy is to focus on exploration, hopefully leading to the discovery of a reward signal to improve on. A learning algorithm capable of dealing with this kind of setting has to be able to (1) explore possible agent behaviors and (2) exploit any possible discovered reward. Exploration algorithms have been proposed that require the definition of a low-dimension behavior space, in which the behavior generated by the agent's policy can be represented. The need to design a priori this space such that it is worth exploring is a major limitation of these algorithms. In this work, we introduce STAX, an algorithm designed to learn a behavior space on-the-fly and to explore it while optimizing any reward discovered (see Figure 1). It does so by separating the exploration and learning of the behavior space from the exploitation of the reward through an alternating two-step process. In the first step, STAX builds a repertoire of diverse policies while learning a low-dimensional representation of the high-dimensional observations generated during the policies evaluation. In the exploitation step, emitters optimize the performance of the discovered rewarding solutions. Experiments conducted on three different sparse reward environments show that STAX performs comparably to existing baselines while requiring much less prior information about the task as it autonomously builds the behavior space it explores.</p>","PeriodicalId":50470,"journal":{"name":"Evolutionary Computation","volume":" ","pages":"275-305"},"PeriodicalIF":4.6,"publicationDate":"2024-09-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"41171496","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}