从标记和未标记的数据中学习贝叶斯多维，用于知识表示

IF 0.8 4区计算机科学 Q4 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE

Intelligent Data Analysis Pub Date : 2023-10-09 DOI:10.3233/ida-227068

Meng Pang, Limin Wang, Qilong Li, Guo Lu, Kuo Li

{"title":"从标记和未标记的数据中学习贝叶斯多维，用于知识表示","authors":"Meng Pang, Limin Wang, Qilong Li, Guo Lu, Kuo Li","doi":"10.3233/ida-227068","DOIUrl":null,"url":null,"abstract":"The Bayesian network classifiers (BNCs) learned from labeled training data are expected to generalize to fit unlabeled testing data based on the independent and identically distributed (i.i.d.) assumption, whereas the asymmetric independence assertion demonstrates the uncertainty of significance of dependency or independency relationships mined from data. A highly scalable BNC should form a distinct decision boundary that can be especially tailored to specific testing instance for knowledge representation. To address the issue of asymmetric independence assertion, in this paper we propose to learn k-dependence Bayesian multinet classifiers in the framework of multistage classification. By partitioning training set and pseudo training set according to high-confidence class labels, the dependency or independency relationships can be fully mined and represented in the topologies of the committee members. Extensive experimental results indicate that the proposed algorithm achieves competitive classification performance compared to single-topology BNCs (e.g., CFWNB, AIWNB and SKDB) and ensemble BNCs (e.g., WATAN, SA2DE, ATODE and SLB) in terms of zero-one loss, root mean square error (RMSE), Friedman test and Nemenyi test.","PeriodicalId":50355,"journal":{"name":"Intelligent Data Analysis","volume":"55 1","pages":"0"},"PeriodicalIF":0.8000,"publicationDate":"2023-10-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"Learning bayesian multinets from labeled and unlabeled data for knowledge representation\",\"authors\":\"Meng Pang, Limin Wang, Qilong Li, Guo Lu, Kuo Li\",\"doi\":\"10.3233/ida-227068\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"The Bayesian network classifiers (BNCs) learned from labeled training data are expected to generalize to fit unlabeled testing data based on the independent and identically distributed (i.i.d.) assumption, whereas the asymmetric independence assertion demonstrates the uncertainty of significance of dependency or independency relationships mined from data. A highly scalable BNC should form a distinct decision boundary that can be especially tailored to specific testing instance for knowledge representation. To address the issue of asymmetric independence assertion, in this paper we propose to learn k-dependence Bayesian multinet classifiers in the framework of multistage classification. By partitioning training set and pseudo training set according to high-confidence class labels, the dependency or independency relationships can be fully mined and represented in the topologies of the committee members. Extensive experimental results indicate that the proposed algorithm achieves competitive classification performance compared to single-topology BNCs (e.g., CFWNB, AIWNB and SKDB) and ensemble BNCs (e.g., WATAN, SA2DE, ATODE and SLB) in terms of zero-one loss, root mean square error (RMSE), Friedman test and Nemenyi test.\",\"PeriodicalId\":50355,\"journal\":{\"name\":\"Intelligent Data Analysis\",\"volume\":\"55 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.8000,\"publicationDate\":\"2023-10-09\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Intelligent Data Analysis\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.3233/ida-227068\",\"RegionNum\":4,\"RegionCategory\":\"计算机科学\",\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"Q4\",\"JCRName\":\"COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Intelligent Data Analysis","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.3233/ida-227068","RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q4","JCRName":"COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE","Score":null,"Total":0}

引用次数: 0

摘要

从标记训练数据中学习到的贝叶斯网络分类器(bcs)基于独立和同分布(i.i.d)的假设，有望泛化到拟合未标记的测试数据，而非对称独立性断言表明了从数据中挖掘的依赖或独立关系的重要性的不确定性。高度可扩展的BNC应该形成一个明确的决策边界，可以特别针对知识表示的特定测试实例进行定制。为了解决非对称独立性断言问题，本文提出在多阶段分类框架下学习k依赖贝叶斯多网分类器。通过根据高置信度的类标签划分训练集和伪训练集，可以充分挖掘和表示委员会成员拓扑结构中的依赖或独立关系。大量的实验结果表明，与单一拓扑bnc(如CFWNB、AIWNB和SKDB)和集成bnc(如WATAN、SA2DE、ATODE和SLB)相比，该算法在0 - 1损失、均方根误差(RMSE)、Friedman检验和Nemenyi检验方面取得了相当的分类性能。

本文章由计算机程序翻译，如有差异，请以英文原文为准。

查看原文本刊更多论文

Learning bayesian multinets from labeled and unlabeled data for knowledge representation

The Bayesian network classifiers (BNCs) learned from labeled training data are expected to generalize to fit unlabeled testing data based on the independent and identically distributed (i.i.d.) assumption, whereas the asymmetric independence assertion demonstrates the uncertainty of significance of dependency or independency relationships mined from data. A highly scalable BNC should form a distinct decision boundary that can be especially tailored to specific testing instance for knowledge representation. To address the issue of asymmetric independence assertion, in this paper we propose to learn k-dependence Bayesian multinet classifiers in the framework of multistage classification. By partitioning training set and pseudo training set according to high-confidence class labels, the dependency or independency relationships can be fully mined and represented in the topologies of the committee members. Extensive experimental results indicate that the proposed algorithm achieves competitive classification performance compared to single-topology BNCs (e.g., CFWNB, AIWNB and SKDB) and ensemble BNCs (e.g., WATAN, SA2DE, ATODE and SLB) in terms of zero-one loss, root mean square error (RMSE), Friedman test and Nemenyi test.

求助全文

通过发布文献求助，成功后即可免费获取论文全文。去求助

来源期刊

Intelligent Data Analysis 工程技术-计算机：人工智能

CiteScore

2.20

自引率

5.90%

发文量

审稿时长

3.3 months

期刊介绍： Intelligent Data Analysis provides a forum for the examination of issues related to the research and applications of Artificial Intelligence techniques in data analysis across a variety of disciplines. These techniques include (but are not limited to): all areas of data visualization, data pre-processing (fusion, editing, transformation, filtering, sampling), data engineering, database mining techniques, tools and applications, use of domain knowledge in data analysis, big data applications, evolutionary algorithms, machine learning, neural nets, fuzzy logic, statistical pattern recognition, knowledge filtering, and post-processing. In particular, papers are preferred that discuss development of new AI related data analysis architectures, methodologies, and techniques and their applications to various domains.