Algorithmic Bias in Student Success Prediction Models: Two Case Studies

Feiya Xiang, Xinzhi Zhang, Jiali Cui, Morgan Carlin, Yang Song
Published in: 2022 IEEE International Conference on Teaching, Assessment and Learning for Engineering (TALE)
Publication date: 2022-12-01
DOI: 10.1109/TALE54877.2022.00058
Citations: 0

Abstract

Machine learning algorithms are increasingly used in today’s society. However, growth in these algorithms means growth in algorithmic bias, and it is imperative that we work to understand the bias that may result. One area in which these algorithms are widely used is educational institutions, where they are often applied to predict student success or retention. In our research, we aim to uncover the biases that may result from building and using machine learning student success models. To do so, we used two publicly available student datasets from educational settings (one from a MOOC and one from secondary education in Portugal) and built models of our own. We then compared the accuracy and fairness of each model type to determine which models performed best on each subcategory of students. Among the models we built, we found that while it is easy to use accuracy to evaluate models and find the most accurate ones, the model that is most accurate overall for a dataset may not predict student success fairly for all subcategories of students. We also found that it is possible to tune models that take fairness into consideration; these models performed more fairly on almost all subcategories of students, though at a slight cost in accuracy. Our results demonstrate the importance of creating and tuning several model types in order to choose a model that balances accuracy and fairness.
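The evaluation the abstract describes — scoring each model's accuracy within every student subcategory and checking whether it treats subgroups fairly — can be illustrated with a minimal sketch. This is not code from the paper; the function names, the choice of demographic parity as the fairness measure, and the toy data are all hypothetical:

```python
# Sketch (not from the paper): comparing a model's predictions across
# student subgroups on both accuracy and a simple fairness gap.

def subgroup_accuracy(y_true, y_pred, groups):
    """Accuracy of the predictions within each subgroup (e.g. by sex or school)."""
    stats = {}
    for g in set(groups):
        idx = [i for i, gr in enumerate(groups) if gr == g]
        correct = sum(1 for i in idx if y_true[i] == y_pred[i])
        stats[g] = correct / len(idx)
    return stats

def demographic_parity_gap(y_pred, groups):
    """Largest difference in positive-prediction ('success') rates between
    subgroups. A gap near 0 means the model predicts success at similar
    rates for every subgroup."""
    rates = {}
    for g in set(groups):
        idx = [i for i, gr in enumerate(groups) if gr == g]
        rates[g] = sum(y_pred[i] for i in idx) / len(idx)
    return max(rates.values()) - min(rates.values())

# Hypothetical binary labels and predictions for two subgroups A and B.
y_true = [1, 0, 1, 1, 0, 1, 0, 0]
y_pred = [1, 0, 1, 0, 0, 1, 1, 0]
groups = ["A", "A", "A", "A", "B", "B", "B", "B"]

acc = subgroup_accuracy(y_true, y_pred, groups)   # per-subgroup accuracy
gap = demographic_parity_gap(y_pred, groups)      # fairness gap
```

Repeating this comparison for several model types, as the authors do, makes the trade-off visible: the model with the best overall accuracy may show a large gap, while a fairness-tuned model shrinks the gap at a small accuracy cost.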