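Enhanced active learning and library compression for high-dimensional evolutionary learning and seismic intensity attenuation relationship case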

IF 7.5 | Zone 1: Computer Science | Q1: COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE

Expert Systems with Applications | Pub Date: 2025-08-22 | DOI: 10.1016/j.eswa.2025.129358

Quanhong Li, Wang Hu, Yu Zhang

Expert Systems with Applications, Volume 297, Article 129358. Full text: https://www.sciencedirect.com/science/article/pii/S0957417425029732

Citations: 0

Abstract


Enhanced active learning and library compression for high-dimensional evolutionary learning and seismic intensity attenuation relationship case

Evolutionary learning algorithms offer powerful frameworks for addressing complex modeling and optimization tasks in domains ranging from engineering design to data analysis. Among these, Genetic Programming (GP) and its variants have proven effective for symbolic regression, automatically discovering mathematical expressions that fit data. In particular, Semantic GP (SGP) improves search by operating on program behaviors (semantics) rather than purely syntactic representations, leading to performance gains over standard GP. However, SGP faces an intrinsically high-dimensional semantic space, because each training point contributes one dimension; this exacerbates the curse of dimensionality and complicates the search. This paper proposes High-Dimensional Efficient Semantic Genetic Programming (HDE-SGP) to address this issue. HDE-SGP integrates three key strategies to reduce semantic complexity: (1) an active sampling mechanism that iteratively selects informative subsets of training data to focus the search, reducing the semantic dimensionality handled in each generation; (2) a semantic library compression method that uses OPTICS clustering to retain only diverse, representative solutions from the evolving population; and (3) a rank-based semantic similarity measure (Spearman's rank correlation) that guides crossover and selection toward behaviorally novel offspring. Experimental results on 16 benchmark symbolic regression datasets, covering both synthetic and real-world problems, show that, with almost no loss in predictive accuracy, HDE-SGP reduces average runtime by approximately 50% and decreases evolved program size by nearly 50% compared to conventional GP and SGP baselines. A case study on the seismic intensity attenuation relationship further validates the effectiveness of the method.
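The three strategies named in the abstract can be illustrated with a small, self-contained Python sketch. It is not the authors' implementation: the abstract does not say how informative subsets are chosen, so `select_active_subset` assumes a simple "largest residual" rule, and `compress_library` uses a greedy similarity filter as a stand-in for OPTICS clustering. All function names and the 0.95 threshold are hypothetical. Only the Spearman rank correlation follows its standard definition (Pearson correlation of average ranks).

```python
from statistics import mean


def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        shared = (start + end) / 2 + 1  # mean of 1-based positions start+1..end+1
        for pos in range(start, end + 1):
            ranks[order[pos]] = shared
        start = end + 1
    return ranks


def spearman(a, b):
    """Spearman's rank correlation: Pearson correlation of the rank vectors."""
    ra, rb = average_ranks(a), average_ranks(b)
    ma, mb = mean(ra), mean(rb)
    cov = sum((x - ma) * (y - mb) for x, y in zip(ra, rb))
    var_a = sum((x - ma) ** 2 for x in ra)
    var_b = sum((y - mb) ** 2 for y in rb)
    if var_a == 0 or var_b == 0:
        return 0.0  # convention chosen here: constant outputs count as uncorrelated
    return cov / (var_a * var_b) ** 0.5


def select_active_subset(targets, predictions, k):
    """ASSUMED rule (not from the paper): keep the k worst-fit training points."""
    errors = [abs(t - p) for t, p in zip(targets, predictions)]
    worst_first = sorted(range(len(errors)), key=lambda i: errors[i], reverse=True)
    return sorted(worst_first[:k])


def compress_library(semantics, threshold=0.95):
    """Greedy stand-in for OPTICS: skip any entry whose semantics rank-correlate
    at or above `threshold` with an entry that is already kept."""
    kept = []
    for idx, sem in enumerate(semantics):
        if all(spearman(sem, semantics[j]) < threshold for j in kept):
            kept.append(idx)
    return kept


# Semantics = vector of program outputs, one entry per training point.
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
targets = [x * x for x in xs]
library = [
    [x for x in xs],             # program: x
    [2 * x + 1 for x in xs],     # program: 2x + 1 (same ranking as x)
    [-x for x in xs],            # program: -x
    [(x - 2) ** 2 for x in xs],  # program: (x - 2)^2
]

print(spearman(library[0], library[1]))                # 1.0
print(compress_library(library))                       # [0, 2, 3]
print(select_active_subset(targets, library[0], k=2))  # [3, 4]
```

In this toy run, `2x + 1` is dropped from the library because it orders the training points exactly as `x` does, while the active subset keeps the two points where the program `x` fits the target `x^2` worst. Evaluating semantics on such a subset shortens each semantic vector, which is the dimensionality reduction the abstract refers to.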


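Source journal

Expert Systems with Applications (Engineering & Technology: Engineering, Electrical & Electronic)

CiteScore: 13.80

Self-citation rate: 10.60%

Articles published: 2045

Review time: 8.7 months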

About the journal: Expert Systems With Applications is an international journal dedicated to the exchange of information on expert and intelligent systems used globally in industry, government, and universities. The journal emphasizes original papers covering the design, development, testing, implementation, and management of these systems, offering practical guidelines. It spans various sectors such as finance, engineering, marketing, law, project management, information management, medicine, and more. The journal also welcomes papers on multi-agent systems, knowledge management, neural networks, knowledge discovery, data mining, and other related areas, excluding applications to military/defense systems.