CaMo: Capturing the modularity by end-to-end models for Symbolic Regression

IF 7.2 | Zone 1, Computer Science | JCR Q1: COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE

Knowledge-Based Systems | Pub Date: 2024-11-22 | DOI: 10.1016/j.knosys.2024.112747

Jingyi Liu, Min Wu, Lina Yu, Weijun Li, Wenqiang Li, Yanjie Li, Meilan Hao, Yusong Deng, Shu Wei

Volume 309, Article 112747. Full text: https://www.sciencedirect.com/science/article/pii/S0950705124013819


Abstract

Modularity is a ubiquitous principle that permeates nature, society, and human endeavors, from biological systems to organizational structures. In Symbolic Regression, which aims to find explicit expressions from observed data, modularity can be viewed as a type of knowledge that captures salient sub-structures to achieve better fitting results. Symbolic Regression is essentially a composition optimization problem, so retaining valuable sub-structures can make the subsequent search more efficient. In this paper, we propose to acquire modularity during the search process and use the term module to denote a useful sub-structure. Specifically, an end-to-end model is chosen to incorporate modules into the search procedure because of its scalability and generalization ability. Modules are treated as high-order knowledge and act as fundamental operators, expanding the search library of Symbolic Regression. The proposed algorithm enables self-learning or self-evolution of modules as part of the learning component. Additionally, a module extraction strategy generates modules hierarchically from the expression tree, and a module update mechanism eliminates unnecessary modules while incorporating useful new ones. Experiments were conducted to evaluate the effectiveness of each component.
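The abstract does not spell out the extraction or update rules, so the following is only a minimal Python sketch of the general idea, not the paper's algorithm. Everything in it is an assumption made for illustration: the `Node` class, the depth window used to pick candidate modules, and the frequency-based pruning in `update_library` are hypothetical stand-ins for whatever criteria the authors actually use.

```python
from collections import Counter
from dataclasses import dataclass
from typing import Iterator, List, Tuple


@dataclass(frozen=True)
class Node:
    """Expression-tree node: an operator, or a leaf (variable/constant)."""
    op: str
    children: Tuple["Node", ...] = ()

    def depth(self) -> int:
        # A leaf has depth 1; an operator is one deeper than its deepest child.
        return 1 + max((c.depth() for c in self.children), default=0)

    def __str__(self) -> str:
        if not self.children:
            return self.op
        return f"{self.op}({', '.join(map(str, self.children))})"


def subtrees(node: Node) -> Iterator[Node]:
    """Yield every subtree, children before parents (bottom-up)."""
    for child in node.children:
        yield from subtrees(child)
    yield node


def extract_modules(expr: Node, min_depth: int = 2, max_depth: int = 3) -> List[Node]:
    """Hypothetical rule: candidate modules are subtrees inside a depth
    window, returned shallow-first so simpler modules come before the
    larger ones that contain them."""
    found = [t for t in subtrees(expr) if min_depth <= t.depth() <= max_depth]
    return sorted(found, key=lambda t: t.depth())


def update_library(library: Counter, candidates: List[Node], capacity: int) -> Counter:
    """Hypothetical rule: count how often each module has been seen and
    keep only the `capacity` most frequent ones."""
    merged = library + Counter(str(c) for c in candidates)
    return Counter(dict(merged.most_common(capacity)))


# Example expression: sin(x) * sin(x) + x
x = Node("x")
expr = Node("add", (Node("mul", (Node("sin", (x,)), Node("sin", (x,)))), x))

modules = extract_modules(expr)
library = update_library(Counter(), modules, capacity=1)

print([str(m) for m in modules])  # ['sin(x)', 'sin(x)', 'mul(sin(x), sin(x))']
print(library)                    # Counter({'sin(x)': 2})
```

In this toy run the repeated sub-structure `sin(x)` survives pruning, and in a module-augmented search it could then be offered as a single operator alongside the primitive ones. A real system would presumably score modules by their contribution to fitting quality rather than by raw frequency.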


Source journal

Knowledge-Based Systems (Engineering & Technology: Computer Science, Artificial Intelligence)

CiteScore

14.80

Self-citation rate

12.50%

Articles published

1245

Review time

7.8 months

Journal description: Knowledge-Based Systems, an international and interdisciplinary journal in artificial intelligence, publishes original, innovative, and creative research results in the field. It focuses on knowledge-based and other artificial intelligence techniques-based systems. The journal aims to support human prediction and decision-making through data science and computation techniques, provide a balanced coverage of theory and practical study, and encourage the development and implementation of knowledge-based intelligence models, methods, systems, and software tools. Applications in business, government, education, engineering, and healthcare are emphasized.