A Genetic Algorithm with Flexible Fitness Function for Feature Selection in Educational Data: Comparative Evaluation

Danielle F. de Albuquerque, Luís Tarrataca, Diego N. Brandão, R. Coutinho
Journal: J. Inf. Data Manag. (Journal Article). Published: 2022-09-21. DOI: 10.5753/jidm.2022.2480 (https://doi.org/10.5753/jidm.2022.2480)

Abstract

Educational Data Mining is an interdisciplinary field that uses computational techniques to help understand educational phenomena. The databases of educational institutions are usually extensive, with many descriptive attributes that make prediction complex. In addition, the data can be sparse, redundant, irrelevant, and noisy, which can degrade the predictive quality of the models and hurt computational performance. One way to simplify the problem is to identify the least important attributes and omit them from the modeling process, which is the role of attribute selection techniques. This work evaluates different feature selection techniques applied to open educational data and paired with a genetic algorithm that uses a flexible fitness function. The methods and results described herein extend a previously published paper by: (i) describing a larger set of computational experiments; (ii) performing a hypothesis test over different classifiers; and (iii) presenting a more in-depth literature review. The results obtained indicate an improvement in the classification process.
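
The abstract does not spell out the algorithm, so the following is only a minimal sketch of how a genetic algorithm with a pluggable fitness function might select features. Everything in it is an assumption made for illustration: the names `make_fitness`, `run_ga`, and `toy_score`, the weighted blend of predictive score and sparsity, and the GA settings (tournament selection, one-point crossover, bit-flip mutation, elitism). In practice `toy_score` would be replaced by something like the cross-validated accuracy of a classifier trained on the selected columns.

```python
import random
from typing import Callable, List

Mask = List[int]  # 1 = feature kept, 0 = feature dropped


def make_fitness(score: Callable[[Mask], float], weight: float = 0.9) -> Callable[[Mask], float]:
    """Build a fitness function blending a predictive score with sparsity.

    Assumption: "flexible" is read here as "the score and its weight can be
    swapped freely"; the paper may define it differently.
    """
    def fitness(mask: Mask) -> float:
        if not any(mask):  # an empty feature set cannot be used
            return 0.0
        sparsity = 1.0 - sum(mask) / len(mask)
        return weight * score(mask) + (1.0 - weight) * sparsity
    return fitness


def run_ga(fitness: Callable[[Mask], float], n_features: int, pop_size: int = 30,
           generations: int = 40, mutation_rate: float = 0.05, seed: int = 0) -> Mask:
    """Evolve binary feature masks and return the best one found."""
    rng = random.Random(seed)
    pop = [[rng.randint(0, 1) for _ in range(n_features)] for _ in range(pop_size)]
    best = max(pop, key=fitness)

    def tournament() -> Mask:
        # Binary tournament: pick two individuals, keep the fitter one.
        a, b = rng.sample(pop, 2)
        return a if fitness(a) >= fitness(b) else b

    for _ in range(generations):
        next_pop = [best[:]]  # elitism: carry the current best forward
        while len(next_pop) < pop_size:
            p1, p2 = tournament(), tournament()
            cut = rng.randrange(1, n_features)  # one-point crossover
            child = p1[:cut] + p2[cut:]
            # Bit-flip mutation, applied independently to each gene.
            child = [1 - g if rng.random() < mutation_rate else g for g in child]
            next_pop.append(child)
        pop = next_pop
        best = max(pop, key=fitness)
    return best


def toy_score(mask: Mask) -> float:
    """Hypothetical stand-in for classifier accuracy: features 0-2 are informative."""
    informative = (0, 1, 2)
    return sum(mask[i] for i in informative) / len(informative)


best_mask = run_ga(make_fitness(toy_score), n_features=10)
print(best_mask, sum(best_mask), "features kept")
```

Lowering `weight` pushes the search toward smaller feature subsets, while `weight=1.0` optimizes the predictive score alone; this is one simple way to make the trade-off explicit, not necessarily the one used in the paper.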