Student performance predictor using multiclass support vector classification algorithm

Suhas Athani, Sharath A Kodli, Mayur N Banavasi, P. Hiremath
Published in: 2017 International Conference on Signal Processing and Communication (ICSPC), 2017-07-01
DOI: 10.1109/CSPC.2017.8305866 (https://doi.org/10.1109/CSPC.2017.8305866)
Citations: 11

Abstract

Educational data mining applies data mining tools and techniques to analyze and visualize an institution's data, and can be used to discover patterns in students' academic performance. The number of secondary schools has grown rapidly in recent years, and an institution's progress can be measured by its students' success and failure rates. Failure rates can be measured in terms of a core subject; the proposed system considers mathematics. The project uses real data collected by a Portuguese school from school reports and questionnaires. Students are classified according to the grade assigned to the range of marks they scored. The grading system has five levels, from grade ‘A’ to grade ‘F’: grade ‘A’ represents the highest marks, ‘B’ the second highest, ‘C’ the third, ‘D’ the fourth, and ‘F’ indicates that the student has failed. Many machine learning algorithms can carry out this type of classification. Multiclass Support Vector Machine and Neural Networks were compared using the Weka tool, and in that analysis the Multiclass Support Vector Machine stood out in accuracy. The Multiclass Support Vector Machine is implemented with a one-vs-rest ("one-to-rest") strategy over the class labels, which is primarily an extension of the linear Support Vector Machine. To obtain good model accuracy, parameters such as ‘C’ and ‘Gamma’ are tuned during implementation. The resulting model predicts the grades provided by the school with good accuracy: measured by K-fold cross-validation, the accuracy of the Support Vector Machine classifier is 89%.
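The data preparation steps named in the abstract can be sketched in plain Python. This is a minimal illustration rather than the authors' implementation: the 0-20 mark scale, the grade cut-offs in `GRADE_BANDS`, and the function names `mark_to_grade`, `one_vs_rest_targets`, and `k_fold_indices` are all assumptions introduced here, since the abstract states neither the mark ranges nor the value of K. The sketch covers grade binning, the per-class binary targets that a one-vs-rest scheme trains on, and the index split behind K-fold cross-validation; the SVM training itself is not shown.

```python
from typing import Dict, List, Tuple

# Assumed grade bands on a 0-20 mark scale (hypothetical; the abstract
# does not state the actual cut-offs). Each entry is (grade, lower bound).
GRADE_BANDS: List[Tuple[str, int]] = [
    ("A", 16), ("B", 14), ("C", 12), ("D", 10), ("F", 0),
]


def mark_to_grade(mark: int) -> str:
    """Map a final mark in [0, 20] to one of the five grade labels."""
    if not 0 <= mark <= 20:
        raise ValueError(f"mark out of range: {mark}")
    for grade, lower_bound in GRADE_BANDS:
        if mark >= lower_bound:
            return grade
    raise AssertionError("unreachable: the F band starts at 0")


def one_vs_rest_targets(grades: List[str]) -> Dict[str, List[int]]:
    """Build one binary target vector per class: 1 for that class, 0 for the rest."""
    classes = [grade for grade, _ in GRADE_BANDS]
    return {c: [1 if g == c else 0 for g in grades] for c in classes}


def k_fold_indices(n_samples: int, k: int) -> List[Tuple[List[int], List[int]]]:
    """Split range(n_samples) into k contiguous folds as (train, test) index pairs."""
    if not 2 <= k <= n_samples:
        raise ValueError("k must be between 2 and n_samples")
    base, extra = divmod(n_samples, k)
    folds: List[Tuple[List[int], List[int]]] = []
    start = 0
    for i in range(k):
        # The first `extra` folds take one additional sample each.
        size = base + (1 if i < extra else 0)
        test = list(range(start, start + size))
        train = [j for j in range(n_samples) if j < start or j >= start + size]
        folds.append((train, test))
        start += size
    return folds


if __name__ == "__main__":
    marks = [18, 15, 12, 10, 7, 20, 13, 9, 14, 11]  # made-up example marks
    grades = [mark_to_grade(m) for m in marks]
    print(grades)
    print(one_vs_rest_targets(grades)["A"])
    for train, test in k_fold_indices(len(marks), k=5):
        print("test fold:", test)
```

In a one-vs-rest setup, one binary classifier would be trained per target vector and the class with the highest decision score chosen at prediction time. The reported accuracy would then be the mean over the K held-out folds.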