Karen Braga Enes, Saulo Moraes Villela, G. Pappa, R. F. Neto
An Approximative Bayes-Optimal Kernel Classifier Based on Version Space Reduction
Proceedings of the ... International Conference on Machine Learning and Applications, vol. 23 1, pp. 436-442. Published 2018-12-01.
DOI: 10.1109/ICMLA.2018.00071 (https://doi.org/10.1109/ICMLA.2018.00071)
Citations: 0
Abstract
The Bayes-optimal classifier is defined as the classifier that induces a hypothesis minimizing the prediction error for any given sample in binary classification problems. Finding the Bayes-optimal classifier is an intractable problem. It is known, however, to be approximately equivalent to the center of mass of the version space, which is the set of all classifiers consistent with the training set. Previous solutions for finding the center of mass are not feasible: they incur a high computational cost and do not work properly on non-linearly separable problems. To address these problems, this paper presents the Dual Version Space Reduction Machine (Dual VSRM), an effective kernel method for approximating the center of mass of the version space. The Dual VSRM algorithm employs successive reductions of the version space based on an oracle's decision. As the oracle, we propose the Ensemble of Dissimilar Balanced Kernel Perceptrons (EBPK). EBPK enhances the accuracy of each individual classifier by balancing the final hyperplane solution, and it maximizes the diversity of its components by applying a dissimilarity measure. To evaluate the proposed methods, we conduct experiments on 7 datasets and compare their performance against several baselines. Our results for EBPK indicate that the strategies for improving the individual accuracy and the diversity of the ensemble components work properly. In addition, the Dual VSRM consistently outperforms the baselines, indicating that the proposed method generates a better approximation to the center of mass.
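To make the idea of a center of mass of the version space concrete, the sketch below averages several dual (kernel) perceptrons trained on different random orderings of the same data. This is not the Dual VSRM or the EBPK described in the abstract: it is a simpler averaging heuristic, swapped in purely for illustration, and it omits the balancing, the dissimilarity measure, and the oracle-driven reductions. All names (`rbf_kernel`, `train_kernel_perceptron`, `approximate_center_of_mass`, `predict`) and the `gamma`, `n_members`, and `max_epochs` parameters are assumptions introduced here.

```python
import math
import random


def rbf_kernel(a, b, gamma=1.0):
    """Gaussian (RBF) kernel between two points given as sequences of floats."""
    return math.exp(-gamma * sum((x - y) ** 2 for x, y in zip(a, b)))


def gram_matrix(X, kernel):
    """Pairwise kernel values for all training points."""
    return [[kernel(a, b) for b in X] for a in X]


def train_kernel_perceptron(K, y, order, max_epochs=100):
    """Dual perceptron without bias; w = sum_i alpha_i * y_i * phi(x_i).

    Stops at the first mistake-free epoch, i.e. when the hypothesis is
    consistent with the training set (a member of the version space).
    """
    n = len(y)
    alpha = [0.0] * n
    for _ in range(max_epochs):
        mistakes = 0
        for i in order:
            score = sum(alpha[j] * y[j] * K[j][i] for j in range(n))
            if y[i] * score <= 0:
                alpha[i] += 1.0
                mistakes += 1
        if mistakes == 0:
            break
    return alpha


def feature_norm(alpha, K, y):
    """Norm of w in feature space, computed from the dual coefficients."""
    n = len(y)
    squared = sum(alpha[i] * alpha[j] * y[i] * y[j] * K[i][j]
                  for i in range(n) for j in range(n))
    return math.sqrt(squared)


def approximate_center_of_mass(X, y, kernel, n_members=25, seed=0):
    """Average unit-norm perceptron solutions found under random orderings.

    Assumption: every member converges within max_epochs; otherwise the
    average is not guaranteed to be consistent with the training set.
    """
    rng = random.Random(seed)
    K = gram_matrix(X, kernel)
    n = len(y)
    center = [0.0] * n
    for _ in range(n_members):
        order = list(range(n))
        rng.shuffle(order)
        alpha = train_kernel_perceptron(K, y, order)
        norm = feature_norm(alpha, K, y)
        for i in range(n):
            center[i] += alpha[i] / (norm * n_members)
    return center


def predict(center, X, y, kernel, x_new):
    """Sign of the averaged hypothesis at x_new (labels are -1 or +1)."""
    score = sum(c * yi * kernel(xi, x_new) for c, yi, xi in zip(center, y, X))
    return 1 if score >= 0 else -1


if __name__ == "__main__":
    # XOR-style toy data: not linearly separable in the input space.
    X = [(0.0, 0.0), (1.0, 1.0), (0.0, 1.0), (1.0, 0.0)]
    y = [-1, -1, 1, 1]
    center = approximate_center_of_mass(X, y, rbf_kernel)
    print([predict(center, X, y, rbf_kernel, x) for x in X])
```

Because the version space is convex, an average of consistent hypotheses is itself consistent, which is why averaging is a natural, if crude, stand-in for the center of mass. Each member is normalized to unit length before averaging so that no single perceptron dominates the result.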