{"title":"Bayesian feature construction for the improvement of classification performance","authors":"M. Maragoudakis","doi":"10.1504/ijdats.2020.10026826","DOIUrl":null,"url":null,"abstract":"In this paper we are going to talk about the problem of the increase in validity, concerning the process of classification, but not through approaches having to do with the improvement of the ability to construct a precise classification model using any algorithm of machine learning. On the contrary, we approach this important matter by the view of a wider encoding of the training data and more specifically under the perspective of the creation of more features so that the hidden angles of the subject areas, which model the available data, are revealed to a higher degree. We suggest the use of a novel feature construction algorithm, which is based on the ability of the Bayesian networks to re-enact the conditional independence assumptions of features, bringing forth properties concerning their interrelation that are not clear when a classifier provides the data in their initial form. The results from the increase of the features are shown through the experimental measurement in a wide domain area and after the use of a large number of classification algorithms, where the improvement of the performance of classification is evident.","PeriodicalId":38582,"journal":{"name":"International Journal of Data Analysis Techniques and Strategies","volume":"44 1","pages":"43-75"},"PeriodicalIF":0.0000,"publicationDate":"2020-02-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"International Journal of Data Analysis Techniques and Strategies","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1504/ijdats.2020.10026826","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q4","JCRName":"Mathematics","Score":null,"Total":0}
引用次数: 0
Abstract
In this paper we are going to talk about the problem of the increase in validity, concerning the process of classification, but not through approaches having to do with the improvement of the ability to construct a precise classification model using any algorithm of machine learning. On the contrary, we approach this important matter by the view of a wider encoding of the training data and more specifically under the perspective of the creation of more features so that the hidden angles of the subject areas, which model the available data, are revealed to a higher degree. We suggest the use of a novel feature construction algorithm, which is based on the ability of the Bayesian networks to re-enact the conditional independence assumptions of features, bringing forth properties concerning their interrelation that are not clear when a classifier provides the data in their initial form. The results from the increase of the features are shown through the experimental measurement in a wide domain area and after the use of a large number of classification algorithms, where the improvement of the performance of classification is evident.