Implementation of Feature Selection Strategies to Enhance Classification Using XGBoost and Decision Tree

Fhara Elvina Pingky Nadya, M.Firdaus Ibadi Ferdiansyah, Vinna Rahmayanti Setyaning Nastiti, Christian Sri Kusuma Aditya
Scientific Journal of Informatics. Published 2024-02-29. DOI: 10.15294/sji.v11i1.48145 (https://doi.org/10.15294/sji.v11i1.48145)

Abstract

Purpose: Grades are often used as the benchmark for whether a student is considered successful during a period of study. Even when schools provide the same facilities and teaching staff to all students, grades still differ, and a gap in grades is still found in every school. This research aims to improve classification accuracy by applying four feature selection methods, Information Gain (IG), Recursive Feature Elimination (RFE), Lasso, and a Hybrid method (RFE + Mutual Information), with XGBoost and Decision Tree models.
Methods: The study used 649 student records from a Portuguese course, pre-processed according to data requirements. Feature selection was then applied to select the features that affect the target, after which the data were classified using XGBoost and Decision Tree, and the results were evaluated and reported.
Results: Information Gain feature selection combined with the XGBoost algorithm achieved the best accuracy of all the combinations tested, at 81.53%.
Novelty: This research contributes an improvement on the classification accuracy reported in previous research, using two traditional machine learning algorithms and several feature selection methods.
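
The Information Gain step named in the abstract can be illustrated with a minimal sketch. It assumes discrete features and computes IG(Y; X) = H(Y) - sum over v of P(X = v) * H(Y | X = v) using only the Python standard library. The function names, feature names, and toy records below are hypothetical and are not taken from the paper. The sketch covers only the feature ranking, not the XGBoost or Decision Tree classification that follows it.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy (in bits) of a sequence of class labels."""
    total = len(labels)
    return -sum((n / total) * log2(n / total) for n in Counter(labels).values())


def information_gain(feature_values, labels):
    """Entropy of the labels minus their entropy conditioned on a discrete feature."""
    total = len(labels)
    groups = {}
    for value, label in zip(feature_values, labels):
        groups.setdefault(value, []).append(label)
    conditional = sum(len(g) / total * entropy(g) for g in groups.values())
    return entropy(labels) - conditional


def top_k_features(rows, labels, k):
    """Score every column of `rows` (a list of dicts) and return the k best names."""
    names = list(rows[0])
    scores = {name: information_gain([r[name] for r in rows], labels) for name in names}
    ranked = sorted(names, key=lambda name: scores[name], reverse=True)
    return ranked[:k], scores


# Hypothetical toy records for illustration; NOT the 649-student dataset.
rows = [
    {"studytime": "high", "internet": "yes"},
    {"studytime": "high", "internet": "no"},
    {"studytime": "low", "internet": "yes"},
    {"studytime": "low", "internet": "no"},
]
labels = ["pass", "pass", "fail", "fail"]

selected, scores = top_k_features(rows, labels, k=1)
print(selected, scores)
```

In this toy data the hypothetical `studytime` column separates the two classes perfectly while `internet` carries no information about the label, so only `studytime` is kept. A classifier would then be trained on the selected columns only.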