Algorithmic Bias in Student Success Prediction Models: Two Case Studies

Feiya Xiang, Xinzhi Zhang, Jiali Cui, Morgan Carlin, Yang Song
Published in: 2022 IEEE International Conference on Teaching, Assessment and Learning for Engineering (TALE)
Publication date: 2022-12-01
DOI: 10.1109/TALE54877.2022.00058
Citations: 0

Abstract

Machine learning algorithms are increasingly used in today’s society. However, growth in these algorithms means growth in algorithmic bias, and it is imperative that we work to understand the bias that may result. One area in which these algorithms are widely used is educational institutions, where they are often applied to predict student success or retention. In our research, we aim to uncover the biases that may result from building and using machine learning student success models. To do so, we used two publicly available student datasets from educational settings (one from a MOOC and one from secondary education in Portugal) and built models of our own. We then compared the accuracy and fairness of each model type to determine which models performed best on each subcategory of students. Among the models we built, we found that while it is easy to use accuracy to evaluate models and find the most accurate ones, the model that is most accurate overall for a dataset may not predict student success fairly for all subcategories of students. We also found that it is possible to tune models that take fairness into consideration; these models performed more fairly on almost all subcategories of students, though at a slight cost in accuracy. Our results demonstrate the importance of creating and tuning several model types in order to choose a model that balances accuracy and fairness.
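The evaluation the abstract describes — scoring each model's accuracy within every student subcategory and checking whether it treats subgroups fairly — can be illustrated with a minimal sketch. This is not code from the paper; the function names, the choice of demographic parity as the fairness measure, and the toy data are all hypothetical:

```python
# Sketch (not from the paper): comparing a model's predictions across
# student subgroups on both accuracy and a simple fairness gap.

def subgroup_accuracy(y_true, y_pred, groups):
    """Accuracy of the predictions within each subgroup (e.g. by sex or school)."""
    stats = {}
    for g in set(groups):
        idx = [i for i, gr in enumerate(groups) if gr == g]
        correct = sum(1 for i in idx if y_true[i] == y_pred[i])
        stats[g] = correct / len(idx)
    return stats

def demographic_parity_gap(y_pred, groups):
    """Largest difference in positive-prediction ('success') rates between
    subgroups. A gap near 0 means the model predicts success at similar
    rates for every subgroup."""
    rates = {}
    for g in set(groups):
        idx = [i for i, gr in enumerate(groups) if gr == g]
        rates[g] = sum(y_pred[i] for i in idx) / len(idx)
    return max(rates.values()) - min(rates.values())

# Hypothetical binary labels and predictions for two subgroups A and B.
y_true = [1, 0, 1, 1, 0, 1, 0, 0]
y_pred = [1, 0, 1, 0, 0, 1, 1, 0]
groups = ["A", "A", "A", "A", "B", "B", "B", "B"]

acc = subgroup_accuracy(y_true, y_pred, groups)   # per-subgroup accuracy
gap = demographic_parity_gap(y_pred, groups)      # fairness gap
```

Repeating this comparison for several model types, as the authors do, makes the trade-off visible: the model with the best overall accuracy may show a large gap, while a fairness-tuned model shrinks the gap at a small accuracy cost.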