IMPLEMENTASI K-NEAREST NEIGHBORD PADA RAPIDMINER UNTUK PREDIKSI KELULUSAN MAHASISWA (Implementation of K-Nearest Neighbor in RapidMiner for Predicting Student Graduation)

High Education of Organization Archive Quality: Jurnal Teknologi Informasi. Pub Date: 2018-05-31. DOI: 10.52972/hoaq.vol10no1.p35-41

Sumarlin Sumarlin, Dewi Anggraini



Abstract

Graduate data is an important factor in determining the quality of both private and public universities, and it is among the key items assessed in the accreditation process. At STIKOM Uyelindo Kupang, graduate data grows every year but is rarely used, so it accumulates as neglected data. To turn this student data into information the institution can use, this study processes it as training data for a data mining model that predicts whether STIKOM Uyelindo Kupang students will graduate. The method is K-Nearest Neighbor (K-NN), run in RapidMiner to measure the algorithm's accuracy on the graduate data. The attributes used are student name, gender, and the cumulative achievement index (GPA) for semesters 1 to 6. Performance is measured with cross-validation, a confusion matrix, and ROC curves; this study uses 5-fold cross-validation. On a dataset of 100 graduate records, the model reached an accuracy of 82% and an AUC of 0.971, which falls in the 0.90-1.00 range classed as very good classification. The study concludes that the accuracy of the graduation model is influenced by the choice of k, the number of nearest neighbors. Under 5-fold validation, the highest accuracy and AUC were obtained at k = 4, with an accuracy of 90%.
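The evaluation pipeline named in the abstract (K-NN, 5-fold cross-validation, a confusion matrix, and AUC) can be sketched in plain Python. This is only an illustrative sketch, not the authors' RapidMiner process: the records are synthetic and generated by a hypothetical `make_synthetic_records` helper, only the six semester GPA values are used as features (the name is treated as an identifier and gender is left out for brevity), Euclidean distance is an assumption, and every function name here is invented for the example. The numbers it prints say nothing about the study's reported results.

```python
import math
import random
from collections import Counter


def make_synthetic_records(n=100, seed=42):
    """Hypothetical data: six semester GPAs per student, label 1 = graduated."""
    rng = random.Random(seed)
    X, y = [], []
    for _ in range(n):
        base = rng.uniform(2.0, 4.0)
        X.append([min(4.0, max(0.0, base + rng.gauss(0, 0.2))) for _ in range(6)])
        y.append(1 if base + rng.gauss(0, 0.15) >= 3.0 else 0)
    return X, y


def knn_predict(train_X, train_y, x, k):
    """Return (majority label, share of the k nearest neighbors labeled 1)."""
    nearest = sorted(zip(train_X, train_y), key=lambda pair: math.dist(pair[0], x))[:k]
    labels = [label for _, label in nearest]
    # On a tie (possible when k is even) the label seen first, i.e. the
    # nearest neighbor's label, wins.
    majority = Counter(labels).most_common(1)[0][0]
    return majority, sum(labels) / k


def k_fold_indices(n, folds, seed=0):
    """Shuffle the indices 0..n-1 and deal them into `folds` disjoint test sets."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[i::folds] for i in range(folds)]


def auc(y_true, scores):
    """AUC as the chance a positive outranks a negative (ties count one half)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def cross_validate(X, y, k, folds=5, seed=0):
    """Pool predictions over all folds; return confusion matrix, accuracy, AUC."""
    y_true, y_pred, scores = [], [], []
    for test_idx in k_fold_indices(len(X), folds, seed):
        held_out = set(test_idx)
        train_X = [X[i] for i in range(len(X)) if i not in held_out]
        train_y = [y[i] for i in range(len(X)) if i not in held_out]
        for i in test_idx:
            label, score = knn_predict(train_X, train_y, X[i], k)
            y_true.append(y[i])
            y_pred.append(label)
            scores.append(score)
    confusion = Counter(zip(y_true, y_pred))  # keys are (actual, predicted)
    accuracy = (confusion[(1, 1)] + confusion[(0, 0)]) / len(y_true)
    return confusion, accuracy, auc(y_true, scores)


if __name__ == "__main__":
    X, y = make_synthetic_records()
    for k in (1, 2, 3, 4, 5):
        confusion, accuracy, area = cross_validate(X, y, k)
        print(f"k={k} accuracy={accuracy:.2f} AUC={area:.2f} confusion={dict(confusion)}")
```

Looping over several values of k mirrors the comparison the abstract describes, where accuracy and AUC are reported per k and the best setting is selected. The AUC here uses the share of positive neighbors as the ranking score, which is one common choice for K-NN and is assumed rather than taken from the paper.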
