Learning a Bayesian network with multiple latent variables for implicit relation representation

IF 2.8; Zone 3 (Computer Science); JCR Q2: COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE

Data Mining and Knowledge Discovery Pub Date : 2024-02-22 DOI:10.1007/s10618-024-01012-3

Xinran Wu, Kun Yue, Liang Duan, Xiaodong Fu


Citations: 0

Abstract


Learning a Bayesian network with multiple latent variables for implicit relation representation


Artificial intelligence applications could become more powerful and comprehensive by incorporating inference capabilities, which can be achieved through probabilistic inference over implicit relations. Representing implicit relations among observed variables and latent ones, such as disease etiologies and user preferences, is important yet challenging. In this paper, we propose the Bayesian network (BN) with multiple latent variables (MLBN) as a framework for representing dependence relations, in which multiple latent variables are incorporated to describe multi-dimensional abstract concepts. However, learning an MLBN efficiently and applying it effectively remain nontrivial because of the multiple latent variables. To this end, we first propose a constraint-induced, Spark-based algorithm for MLBN learning, together with several optimization strategies. We then introduce the concept of variation degree and design a subgraph-based algorithm for incremental learning of MLBN. Experimental results suggest that the proposed MLBN model represents the dependence relations correctly. The proposed method outperforms some state-of-the-art competitors on personalized recommendation and helps some typical approaches achieve better performance.
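To make the idea of probabilistic inference over implicit relations concrete, the following is a minimal sketch and not the paper's MLBN learning algorithm. It assumes a hypothetical toy network in which two binary latent variables (Z1, Z2) are parents of three binary observed variables (X1, X2, X3); all variable names and probability values are invented for illustration. Inference is done by brute-force enumeration, which is only practical for very small networks.

```python
from itertools import product

# Hypothetical toy network (NOT from the paper): latent Z1, Z2; observed X1, X2, X3.
# Edges: Z1 -> X1, Z1 -> X2, Z2 -> X2, Z2 -> X3. All numbers are illustrative.
P_Z1 = {0: 0.6, 1: 0.4}                     # prior P(Z1)
P_Z2 = {0: 0.7, 1: 0.3}                     # prior P(Z2)
P_X1 = {0: 0.2, 1: 0.9}                     # P(X1 = 1 | Z1)
P_X2 = {(0, 0): 0.1, (0, 1): 0.5,
        (1, 0): 0.6, (1, 1): 0.95}          # P(X2 = 1 | Z1, Z2)
P_X3 = {0: 0.3, 1: 0.8}                     # P(X3 = 1 | Z2)

NAMES = ["Z1", "Z2", "X1", "X2", "X3"]


def bern(p_one, x):
    """Probability of binary outcome x given P(x = 1) = p_one."""
    return p_one if x == 1 else 1.0 - p_one


def joint(z1, z2, x1, x2, x3):
    """Joint probability from the BN factorization."""
    return (P_Z1[z1] * P_Z2[z2]
            * bern(P_X1[z1], x1)
            * bern(P_X2[(z1, z2)], x2)
            * bern(P_X3[z2], x3))


def posterior(query, evidence):
    """P(query | evidence) by enumerating all assignments.

    Both arguments are dicts mapping variable names to 0/1 values.
    """
    num = den = 0.0
    for values in product((0, 1), repeat=len(NAMES)):
        assignment = dict(zip(NAMES, values))
        if any(assignment[k] != v for k, v in evidence.items()):
            continue  # inconsistent with the evidence
        p = joint(*values)
        den += p
        if all(assignment[k] == v for k, v in query.items()):
            num += p
    return num / den


if __name__ == "__main__":
    # Observing X1 changes the belief about X2 through the shared latent parent Z1.
    print(posterior({"X2": 1}, {"X1": 1}))
    print(posterior({"X2": 1}, {"X1": 0}))
```

In this sketch X1 and X2 have no direct edge, yet observing X1 shifts the belief about X2 because both depend on the latent variable Z1; that indirect dependence is the kind of implicit relation the abstract refers to.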


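Source journal: Data Mining and Knowledge Discovery (Engineering & Technology; Computer Science: Artificial Intelligence)

CiteScore: 10.40

Self-citation rate: 4.20%

Articles published:

Review time: 10 months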

Journal description: Advances in data gathering, storage, and distribution have created a need for computational tools and techniques to aid in data analysis. Data Mining and Knowledge Discovery in Databases (KDD) is a rapidly growing area of research and application that builds on techniques and theories from many fields, including statistics, databases, pattern recognition and learning, data visualization, uncertainty modelling, data warehousing and OLAP, optimization, and high performance computing.