Implementation of Data Mining for Drop-Out Prediction using Random Forest Method

2020 8th International Conference on Information and Communication Technology (ICoICT), Pub Date: 2020-06-01, DOI: 10.1109/ICoICT49345.2020.9166276

Meylani Utari, B. Warsito, R. Kusumaningrum


Citations: 13


Implementation of Data Mining for Drop-Out Prediction using Random Forest Method

Accreditation is one of the quality measures for a university, and students and graduate students are among the elements it assesses. Preventing student drop-out is therefore an important problem for the university itself: a high drop-out rate harms the institution, for example through a poor reputation or a low accreditation grade. This research presents the results of a case study that analyzes educational data using data mining techniques. The authors use a classification method focused on predicting drop-out among undergraduate and diploma students at the ABC Faculty of XYZ University. Predicting the drop-out class requires academic data; the raw data are the academic records of students who enrolled at the university from 2008 to 2012. The raw data are then preprocessed to handle class imbalance. The research uses the synthetic minority oversampling technique (SMOTE) to handle the imbalanced dataset and the random forest algorithm to predict drop-out on 2492 data records. The random forest algorithm combined with SMOTE gives the best accuracy, 93.43%. The main results can be used to reduce drop-out rates by predicting which students are at risk of dropping out and identifying the factors related to drop-out.
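The abstract names SMOTE as the oversampling step but gives no implementation details. As a rough illustration only, the sketch below shows the core idea of SMOTE in plain Python: each synthetic minority sample is a random point on the line segment between an existing minority sample and one of its nearest minority neighbours. The function name, the two toy features (GPA and credits earned), the value of k, and all data values are assumptions made for this example; they are not taken from the paper, and the paper's actual feature set and tooling are not stated here.

```python
import math
import random


def smote(minority, n_new, k=5, seed=0):
    """Create n_new synthetic minority samples by interpolating between
    a randomly chosen sample and one of its k nearest minority neighbours."""
    if len(minority) < 2:
        raise ValueError("need at least two minority samples")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)
    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(minority))
        base = minority[i]
        # Rank the other minority samples by Euclidean distance to the base.
        others = [j for j in range(len(minority)) if j != i]
        others.sort(key=lambda j: math.dist(base, minority[j]))
        neighbour = minority[rng.choice(others[:k])]
        # Pick a random point on the segment between base and neighbour.
        gap = rng.random()
        synthetic.append([b + gap * (n - b) for b, n in zip(base, neighbour)])
    return synthetic


# Hypothetical toy data: [GPA, credits earned]; not taken from the paper.
dropout = [[1.8, 20.0], [2.0, 24.0], [1.5, 18.0], [2.1, 30.0]]
retained = [[3.0 + 0.05 * i, 60.0 + i] for i in range(12)]

# Oversample the minority (drop-out) class until both classes are the same size.
new_rows = smote(dropout, n_new=len(retained) - len(dropout), k=3)
balanced_dropout = dropout + new_rows
print(len(balanced_dropout), len(retained))  # prints: 12 12
```

In a pipeline like the one the abstract describes, the balanced data would then be passed to a random forest classifier. A common precaution, not stated in the abstract, is to oversample only the training split so that synthetic samples do not leak into the evaluation data.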

