{"title":"迈向智能获取策略:因果多目标贝叶斯优化的强化学习","authors":"Shikun Chen, Yangguang Liu","doi":"10.1016/j.swevo.2026.102290","DOIUrl":null,"url":null,"abstract":"<div><div>Traditional Bayesian optimization relies on hand-crafted acquisition functions that treat variables uniformly, whereas multi-objective systems contain exploitable causal structure. Although recent advances established foundations for causal Bayesian optimization and multi-objective reinforcement learning independently, no existing approach combines these paradigms. Static acquisition policies cannot adapt to causal dependencies and competing objective trade-offs. Here, we present RL-CMBO (Reinforcement Learning for Causal Multi-Objective Bayesian Optimization), a reinforcement learning framework that learns intelligent acquisition policies through experience. Our approach shifts from hand-crafted functions to learned policies capable of discovering optimal intervention strategies. The framework integrates: (1) a meta-learning architecture adapting to task-specific causal structures, (2) state representation encoding Pareto front features and causal graph topology, and (3) constrained action space over Possibly-Optimal Minimal Intervention Sets (POMIS) ensuring causal identifiability. Our reward engineering balances hypervolume improvement, intervention costs, and Pareto front diversity–thus addressing competing objectives in causal systems. Experimental evaluation demonstrates that RL-CMBO outperforms existing methods across synthetic and real-world benchmarks, achieving improved sample efficiency while discovering intervention patterns that static acquisition functions fail to identify. This work establishes the first unified framework combining reinforcement learning, causal reasoning, and multi-objective optimization.</div></div>","PeriodicalId":48682,"journal":{"name":"Swarm and Evolutionary Computation","volume":"101 ","pages":"Article 102290"},"PeriodicalIF":8.5000,"publicationDate":"2026-02-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"Toward intelligent acquisition policies: Reinforcement learning for causal multi-objective Bayesian optimization\",\"authors\":\"Shikun Chen, Yangguang Liu\",\"doi\":\"10.1016/j.swevo.2026.102290\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"<div><div>Traditional Bayesian optimization relies on hand-crafted acquisition functions that treat variables uniformly, whereas multi-objective systems contain exploitable causal structure. Although recent advances established foundations for causal Bayesian optimization and multi-objective reinforcement learning independently, no existing approach combines these paradigms. Static acquisition policies cannot adapt to causal dependencies and competing objective trade-offs. Here, we present RL-CMBO (Reinforcement Learning for Causal Multi-Objective Bayesian Optimization), a reinforcement learning framework that learns intelligent acquisition policies through experience. Our approach shifts from hand-crafted functions to learned policies capable of discovering optimal intervention strategies. The framework integrates: (1) a meta-learning architecture adapting to task-specific causal structures, (2) state representation encoding Pareto front features and causal graph topology, and (3) constrained action space over Possibly-Optimal Minimal Intervention Sets (POMIS) ensuring causal identifiability. 
Our reward engineering balances hypervolume improvement, intervention costs, and Pareto front diversity–thus addressing competing objectives in causal systems. Experimental evaluation demonstrates that RL-CMBO outperforms existing methods across synthetic and real-world benchmarks, achieving improved sample efficiency while discovering intervention patterns that static acquisition functions fail to identify. This work establishes the first unified framework combining reinforcement learning, causal reasoning, and multi-objective optimization.</div></div>\",\"PeriodicalId\":48682,\"journal\":{\"name\":\"Swarm and Evolutionary Computation\",\"volume\":\"101 \",\"pages\":\"Article 102290\"},\"PeriodicalIF\":8.5000,\"publicationDate\":\"2026-02-01\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Swarm and Evolutionary Computation\",\"FirstCategoryId\":\"94\",\"ListUrlMain\":\"https://www.sciencedirect.com/science/article/pii/S2210650226000106\",\"RegionNum\":1,\"RegionCategory\":\"计算机科学\",\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"2026/2/4 0:00:00\",\"PubModel\":\"Epub\",\"JCR\":\"Q1\",\"JCRName\":\"COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Swarm and Evolutionary Computation","FirstCategoryId":"94","ListUrlMain":"https://www.sciencedirect.com/science/article/pii/S2210650226000106","RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"2026/2/4 0:00:00","PubModel":"Epub","JCR":"Q1","JCRName":"COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE","Score":null,"Total":0}
Abstract:
Traditional Bayesian optimization relies on hand-crafted acquisition functions that treat variables uniformly, whereas multi-objective systems contain exploitable causal structure. Although recent advances have independently established foundations for causal Bayesian optimization and multi-objective reinforcement learning, no existing approach combines these paradigms, and static acquisition policies cannot adapt to causal dependencies or competing objective trade-offs. Here, we present RL-CMBO (Reinforcement Learning for Causal Multi-Objective Bayesian Optimization), a reinforcement learning framework that learns intelligent acquisition policies through experience. Our approach shifts from hand-crafted functions to learned policies capable of discovering optimal intervention strategies. The framework integrates: (1) a meta-learning architecture that adapts to task-specific causal structures, (2) a state representation that encodes Pareto front features and causal graph topology, and (3) a constrained action space over Possibly-Optimal Minimal Intervention Sets (POMIS) that ensures causal identifiability. Our reward engineering balances hypervolume improvement, intervention costs, and Pareto front diversity, thus addressing competing objectives in causal systems. Experimental evaluation demonstrates that RL-CMBO outperforms existing methods across synthetic and real-world benchmarks, achieving improved sample efficiency while discovering intervention patterns that static acquisition functions fail to identify. This work establishes the first unified framework combining reinforcement learning, causal reasoning, and multi-objective optimization.
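To make the stated reward trade-off concrete, the sketch below illustrates one plausible reading of it: a per-step reward equal to the hypervolume improvement from a new intervention outcome, minus a weighted intervention cost, plus a weighted diversity bonus on the updated Pareto front. This is a minimal sketch under assumed conventions, not the authors' implementation: the two-objective minimization setting, the weights lambda_cost and mu_div, and all function names are illustrative assumptions.

```python
"""Minimal sketch (assumptions noted above) of a per-step reward that
balances hypervolume improvement, intervention cost, and Pareto front
diversity for a two-objective minimization problem."""
import numpy as np


def pareto_front(points: np.ndarray) -> np.ndarray:
    """Return the non-dominated subset of `points` (minimization)."""
    keep = []
    for i, p in enumerate(points):
        dominated = any(
            np.all(q <= p) and np.any(q < p)
            for j, q in enumerate(points) if j != i
        )
        if not dominated:
            keep.append(p)
    return np.array(keep)


def hypervolume_2d(front: np.ndarray, ref: np.ndarray) -> float:
    """Hypervolume of a 2-D minimization front w.r.t. reference point `ref`."""
    front = front[np.argsort(front[:, 0])]  # x ascending => y descending
    hv, prev_y = 0.0, ref[1]
    for x, y in front:
        hv += max(ref[0] - x, 0.0) * max(prev_y - y, 0.0)
        prev_y = min(prev_y, y)
    return hv


def front_diversity(front: np.ndarray) -> float:
    """Simple diversity proxy: mean pairwise distance between front points."""
    if len(front) < 2:
        return 0.0
    dists = np.linalg.norm(front[:, None, :] - front[None, :, :], axis=-1)
    return dists[np.triu_indices(len(front), k=1)].mean()


def step_reward(observed, candidate, cost, ref, lambda_cost=0.1, mu_div=0.05):
    """Reward for one intervention: HV gain - weighted cost + diversity bonus.

    `lambda_cost` and `mu_div` are assumed trade-off weights, not values
    reported in the paper.
    """
    observed = np.asarray(observed, dtype=float)
    old_front = pareto_front(observed)
    new_front = pareto_front(np.vstack([observed, candidate]))
    hv_gain = hypervolume_2d(new_front, ref) - hypervolume_2d(old_front, ref)
    return hv_gain - lambda_cost * cost + mu_div * front_diversity(new_front)


if __name__ == "__main__":
    history = [[0.8, 0.3], [0.4, 0.7], [0.6, 0.6]]  # previously observed outcomes
    ref_point = np.array([1.0, 1.0])                # worst acceptable objectives
    r = step_reward(history, [0.3, 0.4], cost=1.0, ref=ref_point)
    print(f"per-step reward: {r:.4f}")
```

In the framework the abstract describes, a scalar of this kind would be emitted to the RL agent after each intervention; here it simply scores a single candidate outcome against an existing front.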
Journal description:
Swarm and Evolutionary Computation is a pioneering peer-reviewed journal focused on the latest research and advancements in nature-inspired intelligent computation using swarm and evolutionary algorithms. It covers theoretical, experimental, and practical aspects of these paradigms and their hybrids, promoting interdisciplinary research. The journal prioritizes the publication of high-quality, original articles that push the boundaries of evolutionary computation and swarm intelligence. Additionally, it welcomes survey papers on current topics and novel applications. Topics of interest include, but are not limited to: Genetic Algorithms and Genetic Programming, Evolution Strategies and Evolutionary Programming, Differential Evolution, Artificial Immune Systems, Particle Swarms, Ant Colony, Bacterial Foraging, Artificial Bees, Firefly Algorithm, Harmony Search, Artificial Life, Digital Organisms, Estimation of Distribution Algorithms, Stochastic Diffusion Search, Quantum Computing, Nano Computing, Membrane Computing, Human-centric Computing, Hybridization of Algorithms, Memetic Computing, Autonomic Computing, Self-organizing systems, Combinatorial, Discrete, Binary, Constrained, Multi-objective, Multi-modal, Dynamic, and Large-scale Optimization.