{"title":"迈向智能获取策略:因果多目标贝叶斯优化的强化学习","authors":"Shikun Chen, Yangguang Liu","doi":"10.1016/j.swevo.2026.102290","DOIUrl":null,"url":null,"abstract":"<div><div>Traditional Bayesian optimization relies on hand-crafted acquisition functions that treat variables uniformly, whereas multi-objective systems contain exploitable causal structure. Although recent advances established foundations for causal Bayesian optimization and multi-objective reinforcement learning independently, no existing approach combines these paradigms. Static acquisition policies cannot adapt to causal dependencies and competing objective trade-offs. Here, we present RL-CMBO (Reinforcement Learning for Causal Multi-Objective Bayesian Optimization), a reinforcement learning framework that learns intelligent acquisition policies through experience. Our approach shifts from hand-crafted functions to learned policies capable of discovering optimal intervention strategies. The framework integrates: (1) a meta-learning architecture adapting to task-specific causal structures, (2) state representation encoding Pareto front features and causal graph topology, and (3) constrained action space over Possibly-Optimal Minimal Intervention Sets (POMIS) ensuring causal identifiability. Our reward engineering balances hypervolume improvement, intervention costs, and Pareto front diversity–thus addressing competing objectives in causal systems. Experimental evaluation demonstrates that RL-CMBO outperforms existing methods across synthetic and real-world benchmarks, achieving improved sample efficiency while discovering intervention patterns that static acquisition functions fail to identify. This work establishes the first unified framework combining reinforcement learning, causal reasoning, and multi-objective optimization.</div></div>","PeriodicalId":48682,"journal":{"name":"Swarm and Evolutionary Computation","volume":"101 ","pages":"Article 102290"},"PeriodicalIF":8.5000,"publicationDate":"2026-02-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"Toward intelligent acquisition policies: Reinforcement learning for causal multi-objective Bayesian optimization\",\"authors\":\"Shikun Chen, Yangguang Liu\",\"doi\":\"10.1016/j.swevo.2026.102290\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"<div><div>Traditional Bayesian optimization relies on hand-crafted acquisition functions that treat variables uniformly, whereas multi-objective systems contain exploitable causal structure. Although recent advances established foundations for causal Bayesian optimization and multi-objective reinforcement learning independently, no existing approach combines these paradigms. Static acquisition policies cannot adapt to causal dependencies and competing objective trade-offs. Here, we present RL-CMBO (Reinforcement Learning for Causal Multi-Objective Bayesian Optimization), a reinforcement learning framework that learns intelligent acquisition policies through experience. Our approach shifts from hand-crafted functions to learned policies capable of discovering optimal intervention strategies. The framework integrates: (1) a meta-learning architecture adapting to task-specific causal structures, (2) state representation encoding Pareto front features and causal graph topology, and (3) constrained action space over Possibly-Optimal Minimal Intervention Sets (POMIS) ensuring causal identifiability. 
Our reward engineering balances hypervolume improvement, intervention costs, and Pareto front diversity–thus addressing competing objectives in causal systems. Experimental evaluation demonstrates that RL-CMBO outperforms existing methods across synthetic and real-world benchmarks, achieving improved sample efficiency while discovering intervention patterns that static acquisition functions fail to identify. This work establishes the first unified framework combining reinforcement learning, causal reasoning, and multi-objective optimization.</div></div>\",\"PeriodicalId\":48682,\"journal\":{\"name\":\"Swarm and Evolutionary Computation\",\"volume\":\"101 \",\"pages\":\"Article 102290\"},\"PeriodicalIF\":8.5000,\"publicationDate\":\"2026-02-01\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Swarm and Evolutionary Computation\",\"FirstCategoryId\":\"94\",\"ListUrlMain\":\"https://www.sciencedirect.com/science/article/pii/S2210650226000106\",\"RegionNum\":1,\"RegionCategory\":\"计算机科学\",\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"2026/2/4 0:00:00\",\"PubModel\":\"Epub\",\"JCR\":\"Q1\",\"JCRName\":\"COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Swarm and Evolutionary Computation","FirstCategoryId":"94","ListUrlMain":"https://www.sciencedirect.com/science/article/pii/S2210650226000106","RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"2026/2/4 0:00:00","PubModel":"Epub","JCR":"Q1","JCRName":"COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE","Score":null,"Total":0}
Abstract:
Traditional Bayesian optimization relies on hand-crafted acquisition functions that treat variables uniformly, whereas multi-objective systems contain exploitable causal structure. Although recent advances have independently established foundations for causal Bayesian optimization and multi-objective reinforcement learning, no existing approach combines these paradigms, and static acquisition policies cannot adapt to causal dependencies or competing objective trade-offs. Here, we present RL-CMBO (Reinforcement Learning for Causal Multi-Objective Bayesian Optimization), a reinforcement learning framework that learns intelligent acquisition policies through experience. Our approach shifts from hand-crafted functions to learned policies capable of discovering optimal intervention strategies. The framework integrates: (1) a meta-learning architecture that adapts to task-specific causal structures, (2) a state representation that encodes Pareto front features and causal graph topology, and (3) a constrained action space over Possibly-Optimal Minimal Intervention Sets (POMIS) that ensures causal identifiability. Our reward engineering balances hypervolume improvement, intervention costs, and Pareto front diversity, thus addressing competing objectives in causal systems. Experimental evaluation demonstrates that RL-CMBO outperforms existing methods across synthetic and real-world benchmarks, achieving improved sample efficiency while discovering intervention patterns that static acquisition functions fail to identify. This work establishes the first unified framework combining reinforcement learning, causal reasoning, and multi-objective optimization.
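To make the stated reward trade-off concrete, the sketch below illustrates one plausible reading of it: a per-step reward equal to the hypervolume improvement from a new intervention outcome, minus a weighted intervention cost, plus a weighted diversity bonus on the updated Pareto front. This is a minimal sketch under assumed conventions, not the authors' implementation: the two-objective minimization setting, the weights lambda_cost and mu_div, and all function names are illustrative assumptions.

```python
"""Minimal sketch (assumptions noted above) of a per-step reward that
balances hypervolume improvement, intervention cost, and Pareto front
diversity for a two-objective minimization problem."""
import numpy as np


def pareto_front(points: np.ndarray) -> np.ndarray:
    """Return the non-dominated subset of `points` (minimization)."""
    keep = []
    for i, p in enumerate(points):
        dominated = any(
            np.all(q <= p) and np.any(q < p)
            for j, q in enumerate(points) if j != i
        )
        if not dominated:
            keep.append(p)
    return np.array(keep)


def hypervolume_2d(front: np.ndarray, ref: np.ndarray) -> float:
    """Hypervolume of a 2-D minimization front w.r.t. reference point `ref`."""
    front = front[np.argsort(front[:, 0])]  # x ascending => y descending
    hv, prev_y = 0.0, ref[1]
    for x, y in front:
        hv += max(ref[0] - x, 0.0) * max(prev_y - y, 0.0)
        prev_y = min(prev_y, y)
    return hv


def front_diversity(front: np.ndarray) -> float:
    """Simple diversity proxy: mean pairwise distance between front points."""
    if len(front) < 2:
        return 0.0
    dists = np.linalg.norm(front[:, None, :] - front[None, :, :], axis=-1)
    return dists[np.triu_indices(len(front), k=1)].mean()


def step_reward(observed, candidate, cost, ref, lambda_cost=0.1, mu_div=0.05):
    """Reward for one intervention: HV gain - weighted cost + diversity bonus.

    `lambda_cost` and `mu_div` are assumed trade-off weights, not values
    reported in the paper.
    """
    observed = np.asarray(observed, dtype=float)
    old_front = pareto_front(observed)
    new_front = pareto_front(np.vstack([observed, candidate]))
    hv_gain = hypervolume_2d(new_front, ref) - hypervolume_2d(old_front, ref)
    return hv_gain - lambda_cost * cost + mu_div * front_diversity(new_front)


if __name__ == "__main__":
    history = [[0.8, 0.3], [0.4, 0.7], [0.6, 0.6]]  # previously observed outcomes
    ref_point = np.array([1.0, 1.0])                # worst acceptable objectives
    r = step_reward(history, [0.3, 0.4], cost=1.0, ref=ref_point)
    print(f"per-step reward: {r:.4f}")
```

In the framework the abstract describes, a scalar of this kind would be emitted to the RL agent after each intervention; here it simply scores a single candidate outcome against an existing front.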
Journal description:
Swarm and Evolutionary Computation is a pioneering peer-reviewed journal focused on the latest research and advancements in nature-inspired intelligent computation using swarm and evolutionary algorithms. It covers theoretical, experimental, and practical aspects of these paradigms and their hybrids, promoting interdisciplinary research. The journal prioritizes the publication of high-quality, original articles that push the boundaries of evolutionary computation and swarm intelligence. Additionally, it welcomes survey papers on current topics and novel applications. Topics of interest include, but are not limited to: Genetic Algorithms and Genetic Programming, Evolution Strategies and Evolutionary Programming, Differential Evolution, Artificial Immune Systems, Particle Swarms, Ant Colony, Bacterial Foraging, Artificial Bees, Firefly Algorithm, Harmony Search, Artificial Life, Digital Organisms, Estimation of Distribution Algorithms, Stochastic Diffusion Search, Quantum Computing, Nano Computing, Membrane Computing, Human-centric Computing, Hybridization of Algorithms, Memetic Computing, Autonomic Computing, Self-organizing systems, Combinatorial, Discrete, Binary, Constrained, Multi-objective, Multi-modal, Dynamic, and Large-scale Optimization.