Ensemble Implementation for Predicting Student Graduation with Classification Algorithm

R. Rismayati, Ismarmiaty Ismarmiaty, Syahroni Hidayat
International Journal of Engineering and Computer Science Applications (IJECSA). Published 2022-03-02. DOI: 10.30812/ijecsa.v1i1.1805 (https://doi.org/10.30812/ijecsa.v1i1.1805)

Abstract

Graduating on time is one of the main goals of every student and higher education institution. Many factors can affect a student's length of study, and each student's individual characteristics are an internal factor that also affects the study period. This study uses these characteristics to classify students into those who graduated on time and those who did not. Classification was chosen because it can find a model or pattern that describes and distinguishes the classes in a dataset. The study applies ensemble learning to predict student graduation using a dataset from Kaggle: an IPK (grade point average) dataset collected from a university in Indonesia, consisting of 1687 records and 5 attributes, with imbalanced classes. The target is whether a student is predicted to graduate on time or not. The proposed method is Ensemble Learning Different Contribution Sampling (DCS), and the algorithms used are Logistic Regression, Decision Tree Classifier, Gaussian, Random Forest Classifier, AdaBoost Classifier, Support Vector Classifier (SVC), KNeighbors Classifier (KNN), and MLP Classifier. For each classification algorithm, the test value and accuracy are calculated and then compared across algorithms. The results show that the MLPClassifier algorithm achieves the best accuracy, predicting on-time graduation with 91.87% accuracy. The DCS-LCA classification model does not outperform its constituent base classifiers, namely MLPClassifier at 91.87%, SVC at 91.64%, Logistic Regression at 91.46%, AdaBoost Classifier at 90.87%, Random Forest Classifier at 90.45%, and KNN at 89.80%.
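
The comparison step described in the abstract (computing an accuracy for each classifier and ranking the classifiers) can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions: the labels and predictions below are made-up toy values, not the Kaggle data, and the names `model_a`, `model_b`, and `always_on_time` are hypothetical stand-ins rather than the classifiers evaluated in the paper. Because the dataset is described as imbalanced, the sketch also reports the accuracy of always predicting the majority class, which is a useful reference point when reading accuracy figures.

```python
from collections import Counter


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def majority_baseline(y_true):
    """Accuracy obtained by always predicting the most frequent class."""
    most_common_count = Counter(y_true).most_common(1)[0][1]
    return most_common_count / len(y_true)


def rank_by_accuracy(y_true, predictions):
    """Return (name, accuracy) pairs sorted from best to worst."""
    scores = {name: accuracy(y_true, y_pred) for name, y_pred in predictions.items()}
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Toy, made-up labels: 1 = graduated on time, 0 = did not (8 of 10 are class 1).
y_true = [1, 1, 1, 1, 1, 1, 1, 1, 0, 0]
predictions = {
    "model_a": [1, 1, 1, 1, 1, 1, 1, 1, 1, 0],  # hypothetical classifier output
    "model_b": [1, 1, 1, 1, 1, 1, 1, 0, 0, 1],  # hypothetical classifier output
    "always_on_time": [1] * 10,                 # trivial majority-class predictor
}

ranking = rank_by_accuracy(y_true, predictions)
baseline = majority_baseline(y_true)
for name, score in ranking:
    print(f"{name}: {score:.2%} (majority baseline {baseline:.2%})")
```

In this toy setup `model_b` scores no better than the trivial predictor, which shows why accuracy alone can be hard to interpret on imbalanced data.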