An approach for classifying large dataset using ensemble classifiers

Sajad Khodarahmi Jahan Abad, Mohammad-Reza Zare-Mirakabad, M. Rezaeian
{"title":"An approach for classifying large dataset using ensemble classifiers","authors":"Sajad Khodarahmi Jahan Abad, Mohammad-Reza Zare-Mirakabad, M. Rezaeian","doi":"10.1109/ICCKE.2014.6993440","DOIUrl":null,"url":null,"abstract":"Efficiency of general classification models in various problems is different according to the characteristics and the space of the problem. Even in a particular issue, it may not be distinguished a special privilege for a classifier method than the others. Ensemble classifier methods aim to combine the results of several classifiers to cover the deficiency of each classifier by others. This combination faces high computational complexity if it includes a lazy base classifier, especially when handling large datasets. In this paper a method is proposed to combine the results of classifiers, which uses clustering as a part of the training, resulting in reducing the computational complexity, while it provides an acceptable accuracy. In this method the base classifiers are trained by a part of the input dataset, first. Then, according to the labels defined by the base classifiers, the clusters are created for another part of dataset. Finally, the samples contained in the clusters, the cluster that each sample belongs to it, and the distance of each sample to the center of all clusters are given to an artificial neural network and the final class label of test data is determined by the neural network. Experiments on several datasets show advantages of proposed model.","PeriodicalId":152540,"journal":{"name":"2014 4th International Conference on Computer and Knowledge Engineering (ICCKE)","volume":"1 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2014-12-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"2","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"2014 4th International Conference on Computer and Knowledge Engineering (ICCKE)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/ICCKE.2014.6993440","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 2

Abstract

The performance of general classification models varies from problem to problem according to the characteristics and feature space of each problem. Even within a single problem, no one classifier may hold a clear advantage over the others. Ensemble classifier methods aim to combine the results of several classifiers so that the weaknesses of each classifier are compensated by the others. Such a combination incurs high computational cost when it includes a lazy base classifier, especially on large datasets. This paper proposes a method for combining classifier results that uses clustering as part of the training, reducing the computational complexity while providing acceptable accuracy. In this method, the base classifiers are first trained on one part of the input dataset. Then, according to the labels assigned by the base classifiers, clusters are created for another part of the dataset. Finally, the samples contained in the clusters, the cluster to which each sample belongs, and the distance of each sample to the center of every cluster are given to an artificial neural network, which determines the final class label of the test data. Experiments on several datasets show the advantages of the proposed model.
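A minimal sketch of the pipeline the abstract describes, written with scikit-learn. The specific choices here (decision tree, naive Bayes, and k-NN as base classifiers; KMeans for clustering; an MLP as the combining network; the particular split ratios and cluster count) are illustrative assumptions, not the authors' exact configuration.

```python
# Sketch: ensemble via base classifiers + clustering + neural-network combiner.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier
from sklearn.naive_bayes import GaussianNB
from sklearn.neighbors import KNeighborsClassifier
from sklearn.cluster import KMeans
from sklearn.neural_network import MLPClassifier

X, y = make_classification(n_samples=3000, n_features=20, n_classes=3,
                           n_informative=10, random_state=0)
# Split: part 1 trains the base classifiers, part 2 builds the clusters,
# the remainder is held out as test data.
X1, X_rest, y1, y_rest = train_test_split(X, y, test_size=0.6, random_state=0)
X2, X_test, y2, y_test = train_test_split(X_rest, y_rest, test_size=0.5,
                                          random_state=0)

# Step 1: train the base classifiers on the first part of the dataset.
base_clfs = [DecisionTreeClassifier(random_state=0), GaussianNB(),
             KNeighborsClassifier(n_neighbors=5)]
for clf in base_clfs:
    clf.fit(X1, y1)

# Step 2: label the second part with the base classifiers and cluster it.
# Appending the predicted labels to the features before clustering is one
# way (assumed here) to make clusters reflect the base classifiers' labels.
preds2 = np.column_stack([clf.predict(X2) for clf in base_clfs])
kmeans = KMeans(n_clusters=6, n_init=10, random_state=0)
kmeans.fit(np.hstack([X2, preds2]))

# Step 3: build meta-features (cluster membership plus distance to every
# cluster center) and train a neural network to produce the final label.
def meta_features(X_part, preds_part):
    stacked = np.hstack([X_part, preds_part])
    dists = kmeans.transform(stacked)            # distance to each center
    ids = kmeans.predict(stacked).reshape(-1, 1)  # assigned cluster
    return np.hstack([ids, dists])

mlp = MLPClassifier(hidden_layer_sizes=(32,), max_iter=1000, random_state=0)
mlp.fit(meta_features(X2, preds2), y2)

# Final classification of the test data by the neural network.
preds_test = np.column_stack([clf.predict(X_test) for clf in base_clfs])
y_hat = mlp.predict(meta_features(X_test, preds_test))
print("ensemble accuracy:", (y_hat == y_test).mean())
```

Using the cluster assignment and center distances as the meta-features, rather than the raw feature vectors, is what keeps the combining step cheap even when a lazy learner such as k-NN sits among the base classifiers.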