Analyze Students Performance of a National Exam Using Feature Selection Methods

Hanieh Zehtab Hashemi, Parvaneh Parvasideh, Zahra Hasan Larijani, Fatemeh Moradi
2018 8th International Conference on Computer and Knowledge Engineering (ICCKE), published 2018-10-01
DOI: 10.1109/ICCKE.2018.8566671 (https://doi.org/10.1109/ICCKE.2018.8566671)
Citations: 2

Abstract

Educational institutions now generate large volumes of data and are increasingly interested in analyzing it for their applications. Data mining methods serve this purpose by extracting the knowledge these systems require. Such datasets are usually large, containing many samples and many unnecessary features, so analyzing them without preprocessing leads to inaccurate results. In this study, we identify and evaluate the most important features using several feature selection methods. Because these methods differ in nature, they produce different results; we therefore evaluate the resulting feature subsets by applying several machine learning methods. We use an educational dataset from a single exam and aim to build a reliable model that predicts the final outcome of that exam. After surveying different feature selection and machine learning algorithms, we find that Information Gain and Gain Ratio yield better performance.
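The abstract names Information Gain and Gain Ratio without giving formulas. The following is a minimal sketch of the standard textbook definitions of both scores for discrete features, not the authors' implementation. The toy table, its column names (`mock_grade`, `attended_prep`, `gender`, `student_id`) and its values are hypothetical and are not taken from the paper's dataset.

```python
from collections import Counter
from math import log2


def entropy(values):
    """Shannon entropy (in bits) of a sequence of discrete values."""
    total = len(values)
    return -sum((n / total) * log2(n / total) for n in Counter(values).values())


def information_gain(feature, labels):
    """Reduction in label entropy after splitting on a discrete feature."""
    total = len(labels)
    groups = {}
    for value, label in zip(feature, labels):
        groups.setdefault(value, []).append(label)
    conditional = sum(len(g) / total * entropy(g) for g in groups.values())
    return entropy(labels) - conditional


def gain_ratio(feature, labels):
    """Information gain divided by the entropy of the feature itself."""
    split_info = entropy(feature)
    if split_info == 0:
        return 0.0  # a constant feature carries no information
    return information_gain(feature, labels) / split_info


def rank_features(table, labels, score):
    """Feature names ordered from highest to lowest score."""
    return sorted(table, key=lambda name: score(table[name], labels), reverse=True)


# Hypothetical toy data: eight students, four candidate features.
students = {
    "mock_grade": ["high", "high", "high", "high", "low", "low", "low", "low"],
    "attended_prep": ["yes", "yes", "yes", "no", "no", "no", "no", "yes"],
    "gender": ["f", "m", "f", "m", "f", "m", "f", "m"],
    "student_id": ["s1", "s2", "s3", "s4", "s5", "s6", "s7", "s8"],
}
outcome = ["pass", "pass", "pass", "pass", "fail", "fail", "fail", "fail"]

if __name__ == "__main__":
    for name, column in students.items():
        ig = information_gain(column, outcome)
        gr = gain_ratio(column, outcome)
        print(f"{name:14s} IG={ig:.3f}  GR={gr:.3f}")
    print("Ranked by gain ratio:", rank_features(students, outcome, gain_ratio))
```

On this toy table the unique `student_id` column ties for the highest Information Gain even though it cannot generalise to new students, while Gain Ratio penalises its many distinct values and ranks it below `mock_grade`. That difference is one reason the two scores can select different feature subsets.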