Comparison of Data Mining Classification Algorithms for Tracing the Interests of Prospective New Students

Budiman Budiman
Journal: NUANSA INFORMATIKA
Publication date: 2021-08-10
Publication type: Journal Article
DOI: 10.25134/nuansa.v15i2.4162 (https://doi.org/10.25134/nuansa.v15i2.4162)
Citations: 4

Abstract

Original title (Indonesian): Perbandingan Algoritma Klasifikasi Data Mining untuk Penelusuran Minat Calon Mahasiswa Baru

During the pandemic, AMIK HASS had difficulty identifying prospective new students. To attract public interest, its Marketing Department implemented several strategies for converting prospects into enrolled students. The data mining technique used for prediction is classification, with three algorithms: Naïve Bayes, J48 Decision Tree, and K-Nearest Neighbor. This study performs a comparative analysis of these classification algorithms using WEKA, following the CRISP-DM methodology. All three classifiers use the same dataset of 5,934 records with a percentage split: 70% (4,154 records) as training data and 30% (1,780 records) as test data. In the test results, the J48 Decision Tree achieves the highest accuracy at 90.3%, while K-Nearest Neighbor reaches 87.52% and Naïve Bayes 87.24%. J48 Decision Tree also has the highest AUROC at 0.9654, followed by Naïve Bayes at 0.9461 and K-Nearest Neighbor at 0.9343. With AUROC values above 0.90, all three classifiers fall into the excellent classification category.
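The figures in the abstract can be checked with a few lines of Python. This is only an illustrative sketch, not the study's WEKA workflow: the names `split_sizes`, `rank_by`, and `all_excellent` are hypothetical, and rounding the training share to the nearest record is an assumption that happens to reproduce the reported 4,154 / 1,780 split.

```python
# Figures as reported in the abstract; all names below are illustrative.
TOTAL_RECORDS = 5934
TRAIN_FRACTION = 0.70

# Reported accuracy (%) and AUROC for each classifier.
REPORTED = {
    "J48 Decision Tree": {"accuracy": 90.30, "auroc": 0.9654},
    "K-Nearest Neighbor": {"accuracy": 87.52, "auroc": 0.9343},
    "Naïve Bayes": {"accuracy": 87.24, "auroc": 0.9461},
}


def split_sizes(total, train_fraction):
    """Return (train, test) record counts for a percentage split.

    Assumption: the training share is rounded to the nearest record.
    """
    train = round(total * train_fraction)
    return train, total - train


def rank_by(metric):
    """Classifier names sorted from best to worst on the given metric."""
    return sorted(REPORTED, key=lambda name: REPORTED[name][metric], reverse=True)


def all_excellent(threshold=0.90):
    """True if every classifier's AUROC exceeds the threshold."""
    return all(scores["auroc"] > threshold for scores in REPORTED.values())


if __name__ == "__main__":
    print("train/test sizes:", split_sizes(TOTAL_RECORDS, TRAIN_FRACTION))
    print("by accuracy:", rank_by("accuracy"))
    print("by AUROC:", rank_by("auroc"))
    print("all AUROC > 0.90:", all_excellent())
```

The two rankings differ below first place: K-Nearest Neighbor is second on accuracy, whereas Naïve Bayes is second on AUROC.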