Causal discovery based on hierarchical reinforcement learning
Jingchi Jiang, Rujia Shen, Chao Zhao, Yi Guan, Xuehui Yu, Xuelian Fu
Expert Systems with Applications, volume 279, Article 127466. Published 2025-04-04. DOI: 10.1016/j.eswa.2025.127466
https://www.sciencedirect.com/science/article/pii/S0957417425010887
Citations: 0
Abstract
Conditional independence (CI) tests in causal discovery can determine a set of Markov equivalence classes with respect to the observed data by checking, under the faithfulness and Markov assumptions, whether each pair of variables is d-separated. However, CI tests are intractable for high-dimensional conditioning variables. Motivated by the strength of reinforcement learning in exploring the solution space, we first propose a causal discovery framework based on hierarchical reinforcement learning (CD-HRL). The framework learns causal skeleton discovery and direction identification separately, using two interdependent policies, one high-level and one low-level. Assigning these two distinct subtasks to the high-level and low-level policies enhances exploration efficiency and minimises error accumulation. The high-level policy iteratively generates causal skeletons as subgoals that instruct the low-level policy, which then identifies the causal direction of each pair of variables. Second, to avoid redundant exploration of familiar causal structures, we incorporate a memory module into the high-level agent and define an augmented reward that combines a causal score function with a curiosity term for exploring unknown causal structures. Finally, experiments on both synthetic and real datasets show that the proposed approach outperforms state-of-the-art methods under various data-generating procedures, which follow linear, nonlinear, and ordinary differential equation models with additive Gaussian noise. The code for our CD-HRL method is available online at https://github.com/HITshenrj/CD-HRL.
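To make the CI-testing step mentioned in the abstract concrete, the following is a minimal sketch of a standard Fisher z test on partial correlations, assuming linear relationships with Gaussian noise. It is not taken from the CD-HRL code base; the function names (`pearson`, `partial_corr`, `fisher_z_p_value`) and the toy chain X -> Z -> Y are illustrative assumptions. The test statistic scales with sqrt(n - |cond| - 3), which hints at why large conditioning sets quickly become unreliable for a fixed sample size.

```python
import math
import random


def pearson(a, b):
    """Sample Pearson correlation of two equal-length sequences."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((u - mean_a) * (v - mean_b) for u, v in zip(a, b))
    var_a = sum((u - mean_a) ** 2 for u in a)
    var_b = sum((v - mean_b) ** 2 for v in b)
    return cov / math.sqrt(var_a * var_b)


def partial_corr(cols, i, j, cond):
    """Partial correlation of columns i and j given the columns in cond.

    Uses the standard recursive formula, removing one conditioning
    variable at a time. Fine for small conditioning sets only.
    """
    if not cond:
        return pearson(cols[i], cols[j])
    k, rest = cond[0], cond[1:]
    r_ij = partial_corr(cols, i, j, rest)
    r_ik = partial_corr(cols, i, k, rest)
    r_jk = partial_corr(cols, j, k, rest)
    return (r_ij - r_ik * r_jk) / math.sqrt((1 - r_ik ** 2) * (1 - r_jk ** 2))


def fisher_z_p_value(cols, i, j, cond=()):
    """Two-sided p-value for H0: column i is independent of column j given cond."""
    n = len(cols[i])
    r = partial_corr(cols, i, j, tuple(cond))
    r = max(min(r, 0.999999), -0.999999)  # keep atanh finite
    stat = math.sqrt(n - len(cond) - 3) * abs(math.atanh(r))
    # Under H0 the statistic is approximately |N(0, 1)|.
    return math.erfc(stat / math.sqrt(2))


# Toy data (an assumption for illustration): a linear Gaussian chain X -> Z -> Y.
random.seed(0)
n = 2000
x = [random.gauss(0, 1) for _ in range(n)]
z = [2.0 * xi + random.gauss(0, 1) for xi in x]
y = [-1.5 * zi + random.gauss(0, 1) for zi in z]
cols = [x, y, z]  # column indices: 0 = X, 1 = Y, 2 = Z

p_marginal = fisher_z_p_value(cols, 0, 1)               # X vs Y, no conditioning
p_conditional = fisher_z_p_value(cols, 0, 1, cond=(2,)) # X vs Y given Z
print(f"p-value, X indep. of Y:         {p_marginal:.3g}")
print(f"p-value, X indep. of Y given Z: {p_conditional:.3g}")
```

In this chain, X and Y are marginally dependent but d-separated by Z, so the marginal test should reject independence while the conditional test typically should not. A constraint-based method would use that outcome to remove the X-Y edge from the skeleton.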
About the journal:
Expert Systems With Applications is an international journal dedicated to the exchange of information on expert and intelligent systems used globally in industry, government, and universities. The journal emphasizes original papers covering the design, development, testing, implementation, and management of these systems, offering practical guidelines. It spans various sectors such as finance, engineering, marketing, law, project management, information management, medicine, and more. The journal also welcomes papers on multi-agent systems, knowledge management, neural networks, knowledge discovery, data mining, and other related areas, excluding applications to military/defense systems.