Comparative SVM and Decision Tree Algorithm in Identifying the Eligibility of KIP Scholarship Awardee

Asriyanik, Agung Pambudi
Journal: Conference Series, vol. 38
Published: 2023-12-19
DOI: 10.34306/conferenceseries.v4i1.625 (https://doi.org/10.34306/conferenceseries.v4i1.625)
Citations: 0

Abstract

The scholarship selection process follows specific rules, but when the number of applicants exceeds the quota, a further selection step is needed. Based on observation at a university in Sukabumi, selection for the KIP scholarship does not yet follow a standard method. Several methods can assist the selection process, such as classification based on historical applicant data. The classification algorithms used here are Decision Tree (DT) and Support Vector Machine (SVM). The study follows the SEMMA (Sample, Explore, Modify, Model, Assess) methodology. The dataset of KIP scholarship awardees from 2021-2022 consists of 519 samples with 16 attributes. Exploration showed that the most important features for modeling are Status DTKS, Status P3KE, father's income, mother's income, combined income, and performance. These attributes are converted to numerical data to facilitate model fitting. Under K-Fold Cross-Validation, the Decision Tree model for KIP scholarship classification yields an accuracy of 78.44% over the entire test dataset, a precision of 0.73107 (73.11% of positive predictions are correct), a recall (sensitivity) of 78.45%, and an F1 score of 73.20%. The SVM model yields an accuracy of 80.17%, a precision of 84.44%, and a recall of 80.17%.
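
The reported recall matches the reported accuracy for both models (80.17% for SVM, and 78.45% versus 78.44% for DT). One possible explanation, which the abstract does not confirm, is that precision, recall, and F1 were averaged across classes weighted by class support; under that averaging, recall is always equal to accuracy. The sketch below illustrates this in plain Python. The function name `weighted_metrics` and the toy labels are invented for illustration and are not taken from the paper or its dataset.

```python
from collections import Counter

def weighted_metrics(y_true, y_pred):
    """Return accuracy and support-weighted precision, recall, and F1."""
    n = len(y_true)
    support = Counter(y_true)      # number of true instances per class
    predicted = Counter(y_pred)    # number of predictions per class
    correct = Counter(t for t, p in zip(y_true, y_pred) if t == p)

    accuracy = sum(correct.values()) / n
    precision = recall = f1 = 0.0
    for label, count in support.items():
        # Per-class scores; precision is 0 if the class is never predicted.
        p = correct[label] / predicted[label] if predicted[label] else 0.0
        r = correct[label] / count
        f = 2 * p * r / (p + r) if (p + r) else 0.0
        # Weight each class by its share of the true labels.
        weight = count / n
        precision += weight * p
        recall += weight * r
        f1 += weight * f
    return accuracy, precision, recall, f1

# Toy labels, made up for this example (1 = eligible, 0 = not eligible).
y_true = [1, 1, 1, 1, 1, 1, 0, 0, 0, 0]
y_pred = [1, 1, 1, 1, 1, 0, 1, 1, 0, 0]

accuracy, precision, recall, f1 = weighted_metrics(y_true, y_pred)
print(f"accuracy={accuracy:.4f} precision={precision:.4f} "
      f"recall={recall:.4f} f1={f1:.4f}")
```

In this toy case the weighted recall and the accuracy coincide while the weighted precision differs, which is the same pattern seen in the reported figures. This is only a consistency check on the numbers, not a reconstruction of the authors' evaluation code.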