Sven Meier, Pratik Narendra Raut, Felix Mahr, Nils Thielen, Jörg Franke, Florian Risch
Journal: Information Processing & Management, vol. 62, no. 5, Article 104202 (Impact Factor 7.4; JCR Q1, Computer Science, Information Systems)
Published: 2025-05-09
DOI: 10.1016/j.ipm.2025.104202
URL: https://www.sciencedirect.com/science/article/pii/S0306457325001438
Citation count: 0
Structured knowledge-based causal discovery: Agentic streams of thought
Abstract
Causal discovery—the systematic identification of cause-and-effect relationships among variables—forms the cornerstone of causal inference. Its application enables reliable predictions and targeted interventions across complex systems, from medical treatments to engineering processes. Traditional statistical causal discovery methods face significant limitations with high-dimensional data structures, while existing knowledge-based approaches rely on single large-scale models that raise fundamental concerns about computational efficiency and result reliability. The Agentic Stream of Thought (ASoT) addresses these limitations through a novel architecture that orchestrates multiple smaller open-source language models. The framework integrates hierarchical query decomposition with Model Compiler refinement, while dual-stream thought processing enables balanced analysis through parallel evaluation of competing hypotheses. Dedicated Direction and Transitive Processors enhance reasoning by resolving bidirectional relationships and refining transitive pathways. A two-tiered quality gate system and complementary consensus mechanisms—Delphi protocol and Ensemble Synthesis Method—iteratively refine outputs while mitigating hallucination risks. Empirical evaluations across causal discovery benchmarks and question-answering tasks demonstrate that this approach matches or exceeds state-of-the-art models while enabling local deployment, establishing that sophisticated orchestration of smaller models provides a more sustainable path than increasing model scale alone.
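To make the ideas of consensus, direction resolution, and transitive refinement more concrete, here is a minimal Python sketch. It is not the ASoT implementation: the abstract does not specify the algorithms, so everything below is an assumption made for illustration. The function names (`consensus_edges`, `resolve_directions`, `prune_transitive`, `build_graph`), the simple majority-vote rule standing in for the consensus mechanisms, the 0.5 threshold, and the reading of "refining transitive pathways" as dropping a direct edge when a two-step path exists are all hypothetical.

```python
from collections import Counter


def consensus_edges(votes, threshold=0.5):
    """Keep edges proposed by more than `threshold` of the models.

    `votes` is a list of sets of (cause, effect) pairs, one set per
    (hypothetical) language model. Returns {edge: share_of_models}.
    """
    counts = Counter(edge for model_edges in votes for edge in model_edges)
    n = len(votes)
    return {e: c / n for e, c in counts.items() if c / n > threshold}


def resolve_directions(support):
    """For pairs proposed in both directions, keep the better-supported one.

    Ties are dropped entirely (an illustrative choice, not from the paper).
    """
    resolved = {}
    for (a, b), share in support.items():
        reverse = support.get((b, a))
        if reverse is None or share > reverse:
            resolved[(a, b)] = share
    return resolved


def prune_transitive(edges):
    """Drop a -> c whenever a -> b and b -> c are both present.

    This is a one-hop check, assumed here as a stand-in for
    "refining transitive pathways".
    """
    edge_set = set(edges)
    nodes = {node for edge in edge_set for node in edge}
    redundant = {
        (a, c)
        for (a, c) in edge_set
        for b in nodes
        if b not in (a, c) and (a, b) in edge_set and (b, c) in edge_set
    }
    return edge_set - redundant


def build_graph(votes, threshold=0.5):
    """Run the three illustrative steps in sequence."""
    support = consensus_edges(votes, threshold)
    directed = resolve_directions(support)
    return prune_transitive(directed)


if __name__ == "__main__":
    # Toy edge proposals from three hypothetical models.
    votes = [
        {("smoking", "tar"), ("tar", "cancer"), ("smoking", "cancer")},
        {("smoking", "tar"), ("tar", "cancer"), ("cancer", "tar")},
        {("smoking", "tar"), ("tar", "cancer"),
         ("smoking", "cancer"), ("cancer", "tar")},
    ]
    print(sorted(build_graph(votes)))
```

In this toy run, the reversed edge `cancer -> tar` loses to the better-supported `tar -> cancer`, and the direct edge `smoking -> cancer` is dropped because the path through `tar` already covers it. Whether a direct edge should really be removed is a modeling decision; a real system would need evidence that no direct effect remains.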
About the journal:
Information Processing and Management is dedicated to publishing cutting-edge original research at the convergence of computing and information science. Our scope encompasses theory, methods, and applications across various domains, including advertising, business, health, information science, information technology marketing, and social computing.
We aim to cater to the interests of both primary researchers and practitioners by offering an effective platform for the timely dissemination of advanced and topical issues in this interdisciplinary field. The journal places particular emphasis on original research articles, research survey articles, research method articles, and articles addressing critical applications of research. Join us in advancing knowledge and innovation at the intersection of computing and information science.