Should College Dropout Prediction Models Include Protected Attributes?

Renzhe Yu, Hansol Lee, René F. Kizilcec
Published in: Proceedings of the Eighth ACM Conference on Learning @ Scale
Publication date: 2021-03-28
DOI: 10.1145/3430895.3460139 (https://doi.org/10.1145/3430895.3460139)
Citations: 39

Abstract

Early identification of college dropouts can provide tremendous value for improving student success and institutional effectiveness, and predictive analytics are increasingly used for this purpose. However, ethical concerns have emerged about whether including protected attributes in these prediction models discriminates against underrepresented student groups and exacerbates existing inequities. We examine this issue in the context of a large U.S. research university with both residential and fully online degree-seeking students. Based on comprehensive institutional records for the entire student population across multiple years (N = 93,457), we build machine learning models to predict student dropout after one academic year of study and compare the overall performance and fairness of model predictions with or without four protected attributes (gender, URM, first-generation student, and high financial need). We find that including protected attributes does not impact the overall prediction performance and it only marginally improves the algorithmic fairness of predictions. These findings suggest that including protected attributes is preferable. We offer guidance on how to evaluate the impact of including protected attributes in a local context, where institutional stakeholders seek to leverage predictive analytics to support student success.
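
The comparison described in the abstract (overall performance and fairness of predictions with or without protected attributes) can be sketched roughly as follows. This is only an illustration under stated assumptions, not the authors' pipeline: the abstract does not name the metrics used, so accuracy, recall, and between-group gaps in those two measures are assumed here, and the labels, group indicator, and both prediction vectors are made-up toy values that do not reproduce the paper's results. The names `evaluate`, `pred_blind`, and `pred_aware` are hypothetical.

```python
from typing import Dict, Sequence


def accuracy(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Share of predictions that match the observed outcome."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def recall(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Share of actual dropouts (label 1) that the model flags."""
    flagged = [p for t, p in zip(y_true, y_pred) if t == 1]
    return sum(flagged) / len(flagged) if flagged else float("nan")


def evaluate(
    y_true: Sequence[int], y_pred: Sequence[int], group: Sequence[int]
) -> Dict[str, float]:
    """Overall accuracy plus between-group gaps (group 1 minus group 0)."""
    by_group = {}
    for g in (0, 1):
        idx = [i for i, v in enumerate(group) if v == g]
        yt = [y_true[i] for i in idx]
        yp = [y_pred[i] for i in idx]
        by_group[g] = (accuracy(yt, yp), recall(yt, yp))
    return {
        "accuracy": accuracy(y_true, y_pred),
        "accuracy_gap": by_group[1][0] - by_group[0][0],
        "recall_gap": by_group[1][1] - by_group[0][1],
    }


# Toy data (assumption): 1 = dropped out; group 1 = member of a protected group.
y_true = [1, 0, 1, 0, 1, 0, 1, 0]
group = [0, 0, 0, 0, 1, 1, 1, 1]

# Hypothetical predictions from a model trained without ("blind") and with
# ("aware") the protected attributes. Values are invented for illustration.
pred_blind = [1, 0, 1, 0, 0, 0, 1, 1]
pred_aware = [1, 1, 1, 0, 1, 0, 1, 1]

blind = evaluate(y_true, pred_blind, group)
aware = evaluate(y_true, pred_aware, group)

for name, result in (("blind", blind), ("aware", aware)):
    print(name, result)
```

In a local evaluation of this kind, the two prediction vectors would come from models fitted on the same records with and without the protected attributes, and the gaps would be computed separately for each attribute (gender, URM, first-generation status, high financial need).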