LLM-Driven Causal Discovery via Harmonized Prior

IF 8.9 | Zone 2, Computer Science | JCR Q1: COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE

IEEE Transactions on Knowledge and Data Engineering | Pub Date: 2025-01-13 | DOI: 10.1109/TKDE.2025.3528461

Taiyu Ban;Lyuzhou Chen;Derui Lyu;Xiangyu Wang;Qinrui Zhu;Huanhuan Chen

Volume 37, Issue 4, pp. 1943-1960 | Not open access | https://ieeexplore.ieee.org/document/10839116/

Citations: 0

Abstract

Traditional domain-specific causal discovery relies on expert knowledge to guide data-based structure learning, thereby improving the reliability of the recovered causal structure. Recent studies have shown promise in using Large Language Models (LLMs) as causal experts to build autonomous expert-guided causal discovery systems through causal reasoning over pairs of variables. However, their performance is hampered by inaccuracies in aligning LLM-derived causal knowledge with the actual causal structure. To address this issue, this paper proposes a novel LLM-driven causal discovery framework that confines the LLM's prior to a reliable range. Instead of pairwise causal reasoning, which requires output that is both precise and comprehensive, the LLM is directed to focus on each of these two aspects separately. By combining the resulting causal insights, a unified set of structural constraints is created, termed a harmonized prior, which draws on their respective strengths to ensure prior accuracy. On this basis, we introduce plug-and-play integrations of the harmonized prior into mainstream categories of structure learning methods, thereby enhancing their applicability in practical scenarios. Evaluations on real-world data demonstrate the effectiveness of our approach.
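
The abstract does not spell out how the separate causal insights are merged, so the following is only a minimal sketch of one plausible reading, not the authors' implementation. It assumes the LLM is queried twice: once with a precision-focused prompt (edges it is confident about) and once with a coverage-focused prompt (every edge it considers possible). The function names `harmonize` and `satisfies`, the required/forbidden constraint representation, and the toy variables are all hypothetical.

```python
from itertools import permutations


def harmonize(variables, precise_edges, comprehensive_edges):
    """Merge two hypothetical LLM outputs into one set of structural constraints.

    precise_edges: edges asserted under a precision-focused prompt (assumed).
    comprehensive_edges: edges listed under a coverage-focused prompt (assumed).
    """
    precise = set(precise_edges)
    # An edge asserted under the precision-focused prompt also counts as plausible.
    plausible = set(comprehensive_edges) | precise
    required = precise
    # Any ordered pair outside the coverage-focused set is ruled out.
    forbidden = {pair for pair in permutations(variables, 2) if pair not in plausible}
    return required, forbidden


def satisfies(edges, required, forbidden):
    """Check a candidate graph (a set of directed edges) against the constraints."""
    edges = set(edges)
    return required <= edges and not (edges & forbidden)


# Toy example with made-up variables and made-up LLM answers.
variables = ["smoking", "tar", "cancer"]
required, forbidden = harmonize(
    variables,
    precise_edges=[("smoking", "tar")],
    comprehensive_edges=[("smoking", "tar"), ("tar", "cancer"), ("smoking", "cancer")],
)
```

Under these assumptions, a structure learner could call `satisfies` to reject candidate graphs that omit a required edge or contain a forbidden one, which is one simple way such a prior could be plugged into a search procedure.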



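Source journal

IEEE Transactions on Knowledge and Data Engineering (Engineering & Technology - Engineering: Electrical & Electronic)

CiteScore: 11.70
Self-citation rate: 3.40%
Articles published: 515
Review time: 6 months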

About the journal: The IEEE Transactions on Knowledge and Data Engineering encompasses knowledge and data engineering aspects within computer science, artificial intelligence, electrical engineering, computer engineering, and related fields. It provides an interdisciplinary platform for disseminating new developments in knowledge and data engineering and explores the practicality of these concepts in both hardware and software. Specific areas covered include knowledge-based and expert systems, AI techniques for knowledge and data management, tools and methodologies, distributed processing, real-time systems, architectures, data management practices, database design, query languages, security, fault tolerance, statistical databases, algorithms, performance evaluation, and applications.