Towards an effective practice of learning from data and knowledge

IF 3.2 | Zone 3 (Computer Science) | JCR Q2 (COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE)

International Journal of Approximate Reasoning | Pub Date: 2024-04-05 | DOI: 10.1016/j.ijar.2024.109188

Yizuo Chen, Haiying Huang, Adnan Darwiche

Volume 171, Article 109188. Full text: https://www.sciencedirect.com/science/article/pii/S0888613X24000756

Citations: 0

Abstract

We discuss some recent advances on combining data and knowledge in the context of supervised learning using Bayesian networks. A first set of advances concerns the computational efficiency of learning and inference. It includes a software-level boost based on compiling Bayesian network structures into tractable circuits in the form of tensor graphs, and algorithmic improvements based on exploiting a type of knowledge called unknown functional dependencies. The tensor graphs capitalize on a highly optimized tensor operation (matrix multiplication), which brings orders-of-magnitude speedups in circuit training and evaluation. The exploitation of unknown functional dependencies yields exponential reductions in the size of tractable circuits and gives rise to the notion of causal treewidth, which offers a corresponding complexity bound. Beyond computational efficiency, we discuss empirical evidence showing the promise of learning from a combination of data and knowledge, in terms of data hungriness and robustness against noise perturbations. Sometimes, however, an accurate Bayesian network structure may not be available due to the incompleteness of human knowledge, leading to modeling errors in the form of missing dependencies or missing variable values. On this front, we discuss another set of advances for recovering from certain types of modeling errors. This is achieved using Testing Bayesian networks, which dynamically select parameters based on the input evidence and come with theoretical guarantees of full recovery under certain conditions.
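As a rough illustration of why matrix multiplication is a natural primitive for Bayesian network inference, the sketch below computes a marginal in a two-node network A -> B as a vector-matrix product. It is not the authors' tensor-graph compiler: the network, the probabilities, and the function names `vec_mat` and `marginal_b` are all made up for this example.

```python
# Minimal sketch (assumed toy network A -> B; all numbers are illustrative).
# The marginal P(B) = sum_a P(a) * P(B | a) is a row vector times a matrix.

prior_a = [0.3, 0.7]              # P(A=a0), P(A=a1)
cpt_b_given_a = [
    [0.9, 0.1],                   # P(B=b0 | A=a0), P(B=b1 | A=a0)
    [0.2, 0.8],                   # P(B=b0 | A=a1), P(B=b1 | A=a1)
]


def vec_mat(vec, mat):
    """Row vector times matrix: out[j] = sum_i vec[i] * mat[i][j]."""
    n_cols = len(mat[0])
    return [sum(v * row[j] for v, row in zip(vec, mat)) for j in range(n_cols)]


def marginal_b(prior, cpt):
    """Marginal distribution of B, obtained by summing out A."""
    return vec_mat(prior, cpt)


if __name__ == "__main__":
    print(marginal_b(prior_a, cpt_b_given_a))
```

In a larger network the same pattern repeats over many conditional probability tables, so batching these products as tensor operations lets optimized linear-algebra kernels do most of the work.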


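Source journal

International Journal of Approximate Reasoning (Engineering & Technology: Computer Science, Artificial Intelligence)

CiteScore: 6.90
Self-citation rate: 12.80%
Articles published: 170
Review time: 67 days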

About the journal: The International Journal of Approximate Reasoning is intended to serve as a forum for the treatment of imprecision and uncertainty in Artificial and Computational Intelligence, covering both the foundations of uncertainty theories and the design of intelligent systems for scientific and engineering applications. It publishes high-quality research papers describing theoretical developments or innovative applications, as well as review articles on topics of general interest. Relevant topics include, but are not limited to, probabilistic reasoning and Bayesian networks, imprecise probabilities, random sets, belief functions (Dempster-Shafer theory), possibility theory, fuzzy sets, rough sets, decision theory, non-additive measures and integrals, qualitative reasoning about uncertainty, comparative probability orderings, game-theoretic probability, default reasoning, nonstandard logics, argumentation systems, inconsistency-tolerant reasoning, elicitation techniques, and philosophical foundations and psychological models of uncertain reasoning. Domains of application for uncertain reasoning systems include risk analysis and assessment, information retrieval and database design, information fusion, machine learning, data and web mining, computer vision, image and signal processing, intelligent data analysis, statistics, multi-agent systems, etc.