An approach for classifying large dataset using ensemble classifiers
Sajad Khodarahmi Jahan Abad, Mohammad-Reza Zare-Mirakabad, M. Rezaeian
2014 4th International Conference on Computer and Knowledge Engineering (ICCKE), published 2014-12-22
DOI: 10.1109/ICCKE.2014.6993440
Citations: 2
Abstract
The efficiency of general classification models varies from problem to problem according to the characteristics and the space of the problem. Even within a single problem, no one classifier method can be said to hold a clear advantage over the others. Ensemble classifier methods aim to combine the results of several classifiers so that the deficiencies of each classifier are covered by the others. This combination incurs high computational complexity if it includes a lazy base classifier, especially when handling large datasets. In this paper, a method is proposed for combining the results of classifiers that uses clustering as part of the training, reducing the computational complexity while providing acceptable accuracy. In this method, the base classifiers are first trained on one part of the input dataset. Then, according to the labels assigned by the base classifiers, clusters are created for another part of the dataset. Finally, the samples contained in the clusters, the cluster to which each sample belongs, and the distance of each sample to the center of every cluster are given to an artificial neural network, which determines the final class label of the test data. Experiments on several datasets show the advantages of the proposed model.
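The pipeline described in the abstract can be sketched roughly as follows. This is a minimal illustration assuming scikit-learn, not the authors' implementation: the choice of base classifiers, the number of clusters, and in particular the way the base classifiers' labels guide the clustering (here, simply appending their predictions to the features before running k-means) are all assumptions for demonstration.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.datasets import make_classification
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.neural_network import MLPClassifier
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in for a large dataset, split into three parts:
# one for the base classifiers, one for clustering and combiner
# training, and one held out for the final test.
X, y = make_classification(n_samples=1500, n_features=10,
                           n_informative=6, random_state=0)
X_base, X_rest, y_base, y_rest = train_test_split(
    X, y, test_size=0.6, random_state=0)
X_comb, X_test, y_comb, y_test = train_test_split(
    X_rest, y_rest, test_size=0.5, random_state=0)

# 1) Train the base classifiers on the first part of the dataset.
bases = [DecisionTreeClassifier(random_state=0), GaussianNB()]
for clf in bases:
    clf.fit(X_base, y_base)

def augment(X):
    # Append each base classifier's predicted label to the raw
    # features, so the clustering is influenced by the labels the
    # base classifiers assign (one possible reading of the paper).
    preds = [clf.predict(X).reshape(-1, 1) for clf in bases]
    return np.hstack([X] + preds)

# 2) Build clusters over the second part of the dataset in the
#    label-augmented space.
km = KMeans(n_clusters=4, n_init=10, random_state=0).fit(augment(X_comb))

def combiner_features(X):
    A = augment(X)
    dists = km.transform(A)                # distance to every center
    labels = km.predict(A).reshape(-1, 1)  # cluster each sample is in
    return np.hstack([labels, dists])

# 3) The cluster assignment and the distances to all cluster centers
#    feed a neural network that produces the final class label.
mlp = MLPClassifier(hidden_layer_sizes=(32,), max_iter=1000,
                    random_state=0)
mlp.fit(combiner_features(X_comb), y_comb)

acc = accuracy_score(y_test, mlp.predict(combiner_features(X_test)))
print(f"combiner accuracy: {acc:.3f}")
```

The complexity saving claimed in the abstract comes from the combiner operating on a low-dimensional summary (one cluster id plus a handful of distances) rather than on the full dataset, which matters when a lazy learner such as k-NN would otherwise be consulted for every test sample.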