Learning Bayesian multinets from labeled and unlabeled data for knowledge representation
Authors: Meng Pang, Limin Wang, Qilong Li, Guo Lu, Kuo Li
Journal: Intelligent Data Analysis (impact factor 0.9; JCR Q4, Computer Science, Artificial Intelligence)
Publication date: 2023-10-09
DOI: 10.3233/ida-227068 (https://doi.org/10.3233/ida-227068)
Bayesian network classifiers (BNCs) learned from labeled training data are expected to generalize to unlabeled testing data under the independent and identically distributed (i.i.d.) assumption, whereas the asymmetric independence assertion shows that the significance of the dependency or independency relationships mined from data is uncertain. A highly scalable BNC should form a distinct decision boundary that is tailored to each specific testing instance for knowledge representation. To address the issue of asymmetric independence assertion, we propose learning k-dependence Bayesian multinet classifiers within a multistage classification framework. By partitioning the training set and the pseudo training set according to high-confidence class labels, the dependency or independency relationships can be fully mined and represented in the topologies of the committee members. Extensive experimental results indicate that the proposed algorithm achieves competitive classification performance compared to single-topology BNCs (e.g., CFWNB, AIWNB and SKDB) and ensemble BNCs (e.g., WATAN, SA2DE, ATODE and SLB) in terms of zero-one loss, root mean square error (RMSE), and the Friedman and Nemenyi tests.
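The partitioning step mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' algorithm: it assumes a categorical naive Bayes model (the k = 0 case of a k-dependence classifier) as a stand-in base learner, and the function names and the 0.9 confidence threshold are hypothetical choices made only for illustration. It shows how labeled instances might be split by class and how unlabeled instances might join a pseudo training set only when their predicted label is high-confidence.

```python
from collections import Counter, defaultdict


def fit_naive_bayes(rows, labels, alpha=1.0):
    """Fit a categorical naive Bayes model with Laplace smoothing (alpha)."""
    n_attrs = len(rows[0])
    counts = defaultdict(int)  # (class, attribute index, value) -> frequency
    for row, label in zip(rows, labels):
        for i, value in enumerate(row):
            counts[(label, i, value)] += 1
    return {
        "classes": sorted(set(labels)),
        "class_counts": Counter(labels),
        "n_values": [len({row[i] for row in rows}) for i in range(n_attrs)],
        "counts": counts,
        "alpha": alpha,
        "n": len(rows),
    }


def posterior(model, row):
    """Return the normalized class posterior P(c | row) as a dict."""
    alpha, classes = model["alpha"], model["classes"]
    scores = {}
    for c in classes:
        n_c = model["class_counts"][c]
        score = (n_c + alpha) / (model["n"] + alpha * len(classes))
        for i, value in enumerate(row):
            seen = model["counts"].get((c, i, value), 0)
            score *= (seen + alpha) / (n_c + alpha * model["n_values"][i])
        scores[c] = score
    total = sum(scores.values())
    return {c: s / total for c, s in scores.items()}


def partition_with_pseudo_labels(train_rows, train_labels, test_rows,
                                 threshold=0.9):
    """Split labeled rows by class; add confident test rows as pseudo data.

    The threshold is an assumed hyperparameter, not a value from the paper.
    """
    model = fit_naive_bayes(train_rows, train_labels)
    partitions = {c: [] for c in model["classes"]}
    for row, label in zip(train_rows, train_labels):
        partitions[label].append(row)
    pseudo = {c: [] for c in model["classes"]}
    for row in test_rows:
        post = posterior(model, row)
        label, confidence = max(post.items(), key=lambda kv: kv[1])
        if confidence >= threshold:  # keep only high-confidence labels
            pseudo[label].append(row)
    return partitions, pseudo


# Toy data: two categorical attributes, two classes.
train_rows = [("0", "0")] * 3 + [("1", "1")] * 3
train_labels = ["a"] * 3 + ["b"] * 3
test_rows = [("0", "0"), ("1", "1"), ("0", "1")]
partitions, pseudo = partition_with_pseudo_labels(
    train_rows, train_labels, test_rows
)
```

In a multinet setting, each per-class partition (labeled plus pseudo-labeled instances) would then be used to learn a class-specific network topology; the sketch stops before that step because the abstract does not describe how the topologies are learned.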
About the journal:
Intelligent Data Analysis provides a forum for the examination of issues related to the research and applications of Artificial Intelligence techniques in data analysis across a variety of disciplines. These techniques include (but are not limited to): all areas of data visualization, data pre-processing (fusion, editing, transformation, filtering, sampling), data engineering, database mining techniques, tools and applications, use of domain knowledge in data analysis, big data applications, evolutionary algorithms, machine learning, neural nets, fuzzy logic, statistical pattern recognition, knowledge filtering, and post-processing. In particular, papers are preferred that discuss development of new AI related data analysis architectures, methodologies, and techniques and their applications to various domains.