Comparison of C4.5 and Naive Bayes Algorithm Methods in Prediction of Student Graduation on Time (Case Study: Information Systems Study Program)

Disty Dikriani, Alvina Tahta Indal Karim
Journal of Dinda: Data Science, Information Technology, and Data Analytics. Published 2023-02-04. DOI: 10.20895/dinda.v3i1.782 (https://doi.org/10.20895/dinda.v3i1.782)
Citations: 1

Abstract

In tertiary institutions, students are one of the important parameters in evaluating how a study program is run. Predicting student graduation is therefore a particular concern, and early identification of students is an important step. Data mining is one way to process the information needed for this prediction, and it is applicable when a university, and especially a study program, does not yet have an early classification of whether students will graduate on time. The Information Systems (IS) study program at ITTP is one such program: it has no early identification of on-time graduation. The factors that determine graduation for ITTP IS students include GPA, TOEFL score, and total credits. The purpose of this research is to find out which attributes have the most influence in predicting the graduation of ITTP IS students. The prediction uses two classification methods, the C4.5 algorithm and Naïve Bayes. Classification is used both to determine which attributes affect the prediction of on-time graduation and to compare the two methods. The results show that a training set size of 70% gives the best accuracy compared with the other training set sizes. Comparing the two methods, the C4.5 algorithm has good accuracy at a training set size of 70%, while Naïve Bayes has higher accuracy at a training set size of 75%. The C4.5 decision tree places GPA at its root, indicating that GPA is the most influential attribute for predicting on-time graduation. The research is expected to serve as a reference for the ITTP IS study program in formulating policies on on-time student graduation, and as a reference for future researchers making predictions in the same field.
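
To illustrate how a C4.5 tree can end up with GPA at its root, the following is a minimal sketch of the gain ratio criterion that C4.5 uses to choose a split. The records, the category names (`high`/`low`, `pass`/`fail`, `complete`/`incomplete`), and the function names are all hypothetical and are not taken from the paper's dataset. The sketch also assumes the attributes have already been discretized into categories; C4.5 can handle continuous attributes through threshold splits, which are omitted here.

```python
import math
from collections import Counter


def entropy(values):
    """Shannon entropy (in bits) of a list of categorical values."""
    total = len(values)
    return -sum((n / total) * math.log2(n / total)
                for n in Counter(values).values())


def gain_ratio(rows, labels, attr):
    """Gain ratio of a categorical attribute: information gain / split info."""
    total = len(labels)
    # Group the class labels by the value each record has for `attr`.
    groups = {}
    for row, label in zip(rows, labels):
        groups.setdefault(row[attr], []).append(label)
    # Weighted entropy of the class labels after splitting on `attr`.
    remainder = sum(len(g) / total * entropy(g) for g in groups.values())
    info_gain = entropy(labels) - remainder
    # Split information penalizes attributes with many distinct values.
    split_info = entropy([row[attr] for row in rows])
    return info_gain / split_info if split_info > 0 else 0.0


# Hypothetical, made-up records (NOT the paper's data).
rows = [
    {"gpa": "high", "toefl": "pass", "credits": "complete"},
    {"gpa": "high", "toefl": "fail", "credits": "complete"},
    {"gpa": "high", "toefl": "pass", "credits": "incomplete"},
    {"gpa": "low", "toefl": "pass", "credits": "complete"},
    {"gpa": "low", "toefl": "fail", "credits": "incomplete"},
    {"gpa": "low", "toefl": "pass", "credits": "incomplete"},
]
labels = ["on_time", "on_time", "on_time", "late", "late", "late"]

# The attribute with the highest gain ratio becomes the root of the tree.
ratios = {a: gain_ratio(rows, labels, a) for a in ("gpa", "toefl", "credits")}
root = max(ratios, key=ratios.get)
print(root, ratios)
```

In this toy data the GPA category separates the two classes perfectly, so it receives the highest gain ratio and would be chosen as the root. Whether the same holds on real records depends entirely on the data.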