Attribute Selection Effect on Tree-Based Classifiers for Letter Recognition

Rizal Dwi Prayogo, N. Ikhsan
Published in: 2020 International Conference on Data Science, Artificial Intelligence, and Business Analytics (DATABIA)
Publication date: 2020-07-01
DOI: https://doi.org/10.1109/DATABIA50434.2020.9190393
Citations: 3

Abstract

This study presents evaluation measures for the effect of attribute selection on classification performance when classifying the 26 uppercase letters of the English alphabet. Attribute selection is an essential step in the classification phase: it measures how significant each attribute is with respect to the class label, since not all attributes are significant for letter recognition. Insignificant attributes should therefore be removed by applying dimensionality reduction. Filter-based attribute selection methods using Information Gain, Gain Ratio, Correlation, and Chi-square are proposed. The attribute selection methods are evaluated with tree-based classifiers (J48, CART, and Random Forest) using accuracy, precision, recall, F-measure, and processing time as measures. The results indicate that attribute selection improves classification performance for letter recognition. The reduction of insignificant attributes is discussed in terms of its effect on classification accuracy and processing time. The optimal number of selected attributes is determined for each attribute selection method, yielding better classification accuracy with shorter processing time.
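As an illustration of the filter-based ranking described in the abstract, the following is a minimal sketch of Information Gain and Gain Ratio scoring in plain Python. It is not the authors' implementation: it assumes the attributes are already discrete, and the attribute names, toy rows, and the choice of keeping the top two attributes are hypothetical values chosen only for the example.

```python
import math
from collections import Counter


def entropy(values):
    """Shannon entropy (in bits) of a sequence of discrete values."""
    total = len(values)
    return -sum((n / total) * math.log2(n / total) for n in Counter(values).values())


def information_gain(values, labels):
    """Reduction in class entropy after partitioning on a discrete attribute."""
    total = len(labels)
    groups = {}
    for value, label in zip(values, labels):
        groups.setdefault(value, []).append(label)
    conditional = sum(len(g) / total * entropy(g) for g in groups.values())
    return entropy(labels) - conditional


def gain_ratio(values, labels):
    """Information gain divided by the attribute's own entropy (split information)."""
    split_info = entropy(values)
    return information_gain(values, labels) / split_info if split_info > 0 else 0.0


def rank_attributes(rows, labels, names, score=information_gain):
    """Return (name, score) pairs sorted from most to least informative."""
    columns = list(zip(*rows))
    scored = [(name, score(col, labels)) for name, col in zip(names, columns)]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Hypothetical toy data: three discrete attributes and two letter classes.
names = ["attr_1", "attr_2", "attr_3"]
rows = [
    (0, 1, 0),
    (0, 1, 1),
    (1, 0, 0),
    (1, 0, 1),
]
labels = ["A", "A", "B", "B"]

ranking = rank_attributes(rows, labels, names)
# Keep the two highest-ranked attributes (the cut-off is an assumption here).
selected = [name for name, _ in ranking[:2]]
```

In this toy data, attr_3 carries no information about the class, so it ranks last and is dropped. Passing `score=gain_ratio` to `rank_attributes` ranks by Gain Ratio instead; in a study like the one described, the reduced attribute set would then be passed to a tree-based classifier and the cut-off varied to find the best-performing number of attributes.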