Authors: Quanhong Li, Wang Hu, Yu Zhang
Journal: Expert Systems with Applications, volume 297, Article 129358 (JCR Q1, Computer Science, Artificial Intelligence; impact factor 7.5)
Published: 2025-08-22
DOI: 10.1016/j.eswa.2025.129358
URL: https://www.sciencedirect.com/science/article/pii/S0957417425029732
Enhanced active learning and library compression for high-dimensional evolutionary learning and seismic intensity attenuation relationship case
Evolutionary learning algorithms offer powerful frameworks for addressing complex modeling and optimization tasks in domains ranging from engineering design to data analysis. Among these, Genetic Programming (GP) and its variants have proven effective for symbolic regression, automatically discovering mathematical expressions that fit data. In particular, Semantic GP (SGP) improves search by operating on program behaviors (semantics) rather than purely syntactic representations, leading to performance gains over standard GP. However, SGP faces the challenge of an intrinsically high-dimensional semantic space — each training point contributes one dimension — which exacerbates the curse of dimensionality and complicates the search. This paper proposes a novel approach to address this issue: High-Dimensional Efficient Semantic Genetic Programming (HDE-SGP). HDE-SGP integrates three key strategies to reduce semantic complexity: an active sampling mechanism that iteratively selects informative subsets of training data to focus the search (mitigating semantic dimensionality per generation), a semantic library compression method using OPTICS clustering to retain only diverse, representative solutions from the evolving population, and a rank-based semantic similarity measure (Spearman’s rank correlation) to guide crossover and selection toward behaviorally novel offspring. Experimental results on 16 benchmark symbolic regression datasets—covering both synthetic and real-world problems—show that, with almost no loss in predictive accuracy, HDE-SGP reduces average runtime by approximately 50 % and decreases evolved program size by nearly 50 % compared to conventional GP and SGP baselines. Moreover, a case study on the Seismic Intensity Attenuation Relationship further validates the effectiveness of the method.
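The abstract names three ingredients without giving their details, so the following is only a minimal sketch of how they might fit together, not the authors' implementation. It assumes a program's semantics is the list of its outputs on the training points. `spearman` computes Spearman's rank correlation as the Pearson correlation of tie-averaged ranks. `compress_library` is a simple greedy filter used here in place of OPTICS clustering, and `select_active_subset` is a hypothetical largest-error sampling rule. All function names and the 0.95 threshold are illustrative assumptions.

```python
from math import sqrt


def average_ranks(values):
    """Return 1-based ranks of `values`; tied values share their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the run of tied values starting at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman(a, b):
    """Spearman's rank correlation: Pearson correlation of the rank vectors."""
    ra, rb = average_ranks(a), average_ranks(b)
    n = len(ra)
    mean_a, mean_b = sum(ra) / n, sum(rb) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(ra, rb))
    var_a = sum((x - mean_a) ** 2 for x in ra)
    var_b = sum((y - mean_b) ** 2 for y in rb)
    if var_a == 0 or var_b == 0:
        return 0.0  # convention chosen here for constant semantics vectors
    return cov / sqrt(var_a * var_b)


def compress_library(semantics, threshold=0.95):
    """Greedy stand-in for clustering-based compression (NOT OPTICS).

    Keeps the index of a program only if its rank correlation with every
    already-kept program is below `threshold` (an assumed cut-off).
    """
    kept = []
    for idx, sem in enumerate(semantics):
        if all(spearman(sem, semantics[k]) < threshold for k in kept):
            kept.append(idx)
    return kept


def select_active_subset(targets, predictions, k):
    """Hypothetical active-sampling rule: indices of the k largest absolute errors."""
    errors = [abs(t - p) for t, p in zip(targets, predictions)]
    worst_first = sorted(range(len(errors)), key=lambda i: -errors[i])
    return sorted(worst_first[:k])


if __name__ == "__main__":
    library = [[1, 2, 3, 4], [2, 4, 6, 8], [4, 3, 2, 1], [1, 3, 2, 4]]
    print(compress_library(library))
    print(select_active_subset([1, 2, 3, 4], [1, 5, 3, 2], k=2))
```

In this toy library the second program is a rescaled copy of the first, so its rank correlation with the first is 1 and the filter drops it. A rank-based measure treats any two programs that order the training points identically as behaviorally equivalent, which is one plausible reason to prefer it over a distance on raw outputs.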
About the journal:
Expert Systems With Applications is an international journal dedicated to the exchange of information on expert and intelligent systems used globally in industry, government, and universities. The journal emphasizes original papers covering the design, development, testing, implementation, and management of these systems, offering practical guidelines. It spans various sectors such as finance, engineering, marketing, law, project management, information management, medicine, and more. The journal also welcomes papers on multi-agent systems, knowledge management, neural networks, knowledge discovery, data mining, and other related areas, excluding applications to military/defense systems.