Comparative SVM and Decision Tree Algorithm in Identifying the Eligibility of KIP Scholarship Awardee

Conference Series. Pub Date: 2023-12-19. DOI: 10.34306/conferenceseries.v4i1.625

Asriyanik, Agung Pambudi


Citations: 0

Abstract


The scholarship selection process follows specific rules, but when the number of applicants exceeds the quota, a further selection step is needed. Observation at a university in Sukabumi shows that selection for the KIP scholarship does not yet follow a standard method. Several methods can assist the selection process, such as classification based on historical applicant data. The classification algorithms used here are Decision Tree (DT) and Support Vector Machine (SVM). The research follows the SEMMA (Sample, Explore, Modify, Model, Assess) method. The dataset of KIP scholarship awardees from 2021-2022 consists of 519 samples with 16 attributes. Exploration indicates that the most important features for modeling are Status DTKS, Status P3KE, father's income, mother's income, combined income, and performance. These attributes are converted to numerical data to facilitate model fitting. Under K-Fold Cross-Validation, the Decision Tree model achieves an accuracy of 78.44% on the entire test dataset, a precision of 73.11% (0.73107), a recall (sensitivity) of 78.45%, and an F1 score of 73.20%. The SVM model achieves an accuracy of 80.17%, a precision of 84.44%, and a recall of 80.17%.
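A note on reading these figures: for both models the reported recall is essentially identical to the accuracy (78.45% vs 78.44% for the Decision Tree, 80.17% for both SVM figures). That pattern is what one would expect if precision, recall, and F1 were averaged over classes with support weights, because support-weighted recall is algebraically equal to accuracy. The abstract does not state the averaging mode, so this is an assumption rather than a confirmed detail of the study. The sketch below illustrates the identity in plain Python; the function name `weighted_metrics` and the confusion-matrix counts are hypothetical and are not taken from the paper.

```python
from typing import Dict, List


def weighted_metrics(confusion: List[List[int]]) -> Dict[str, float]:
    """Compute accuracy and support-weighted precision, recall, and F1.

    confusion[i][j] is the number of samples whose true class is i
    and whose predicted class is j.
    """
    n_classes = len(confusion)
    total = sum(sum(row) for row in confusion)
    correct = sum(confusion[c][c] for c in range(n_classes))

    precision = recall = f1 = 0.0
    for c in range(n_classes):
        tp = confusion[c][c]
        support = sum(confusion[c])  # samples that truly belong to class c
        predicted = sum(confusion[r][c] for r in range(n_classes))  # predicted as c
        p = tp / predicted if predicted else 0.0
        r = tp / support if support else 0.0
        f = 2 * p * r / (p + r) if (p + r) else 0.0
        weight = support / total  # class share of the dataset
        precision += weight * p
        recall += weight * r
        f1 += weight * f

    return {
        "accuracy": correct / total,
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


# Hypothetical two-class confusion matrix (eligible / not eligible);
# these counts are illustrative only and do not come from the study.
example = [
    [70, 10],
    [15, 25],
]
metrics = weighted_metrics(example)
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```

Under weighted averaging, each class's recall is multiplied by its share of the samples, so the sum reduces to the number of correct predictions divided by the total, which is the accuracy. Precision and F1 have no such identity and can differ from accuracy in either direction.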

