Dissecting bias of ChatGPT in college major recommendations

Alex Zheng
{"title":"Dissecting bias of ChatGPT in college major recommendations","authors":"Alex Zheng","doi":"10.1007/s10799-024-00430-5","DOIUrl":null,"url":null,"abstract":"<p>Large language models (LLMs) such as ChatGPT play a crucial role in guiding critical decisions nowadays, such as in choosing a college major. Therefore, it is essential to assess the limitations of these models’ recommendations and understand any potential biases that may mislead human decisions. In this study, I investigate bias in terms of GPT-3.5 Turbo’s college major recommendations for students with various profiles, looking at demographic disparities in factors such as race, gender, and socioeconomic status, as well as educational disparities such as score percentiles. To conduct this analysis, I sourced public data for California seniors who have taken standardized tests like the California Standard Test (CAST) in 2023. By constructing prompts for the ChatGPT API, allowing the model to recommend majors based on high school student profiles, I evaluate bias using various metrics, including the Jaccard Coefficient, Wasserstein Metric, and STEM Disparity Score. The results of this study reveal a significant disparity in the set of recommended college majors, irrespective of the bias metric applied. Notably, the most pronounced disparities are observed for students who fall into minority categories, such as LGBTQ + , Hispanic, or the socioeconomically disadvantaged. Within these groups, ChatGPT demonstrates a lower likelihood of recommending STEM majors compared to a baseline scenario where these criteria are unspecified. For example, when employing the STEM Disparity Score metric, an LGBTQ + student scoring at the 50th percentile faces a 50% reduced chance of receiving a STEM major recommendation in comparison to a male student, with all other factors held constant. Additionally, an average Asian student is three times more likely to receive a STEM major recommendation than an African-American student. Meanwhile, students facing socioeconomic disadvantages have a 30% lower chance of being recommended a STEM major compared to their more privileged counterparts. These findings highlight the pressing need to acknowledge and rectify biases within language models, especially when they play a critical role in shaping personalized decisions. Addressing these disparities is essential to foster a more equitable educational and career environment for all students.</p>","PeriodicalId":13616,"journal":{"name":"Information Technology and Management","volume":"40 1","pages":""},"PeriodicalIF":0.0000,"publicationDate":"2024-06-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Information Technology and Management","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1007/s10799-024-00430-5","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 0

Abstract

Large language models (LLMs) such as ChatGPT now play a crucial role in guiding critical decisions, such as choosing a college major. It is therefore essential to assess the limitations of these models' recommendations and understand any potential biases that may mislead human decisions. In this study, I investigate bias in GPT-3.5 Turbo's college major recommendations for students with various profiles, examining demographic disparities in factors such as race, gender, and socioeconomic status, as well as educational disparities such as score percentiles. To conduct this analysis, I sourced public data for California seniors who took standardized tests such as the California Science Test (CAST) in 2023. I construct prompts for the ChatGPT API that ask the model to recommend majors based on high school student profiles, and I evaluate bias using several metrics, including the Jaccard coefficient, the Wasserstein metric, and a STEM Disparity Score. The results reveal a significant disparity in the set of recommended college majors, irrespective of the bias metric applied. Notably, the most pronounced disparities are observed for students who fall into minority categories, such as LGBTQ+, Hispanic, or socioeconomically disadvantaged students. Within these groups, ChatGPT is less likely to recommend STEM majors than in a baseline scenario where these attributes are unspecified. For example, under the STEM Disparity Score metric, an LGBTQ+ student scoring at the 50th percentile has a 50% lower chance of receiving a STEM major recommendation than a male student, with all other factors held constant. Additionally, an average Asian student is three times more likely to receive a STEM major recommendation than an African-American student. Meanwhile, socioeconomically disadvantaged students have a 30% lower chance of being recommended a STEM major than their more privileged counterparts. These findings highlight the pressing need to acknowledge and rectify biases within language models, especially when they play a critical role in shaping personalized decisions. Addressing these disparities is essential to foster a more equitable educational and career environment for all students.
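The abstract names the bias metrics but this page does not reproduce the paper's code or exact prompt wording. Below is a minimal sketch of how such pairwise comparisons could be computed, assuming a hypothetical prompt template, toy recommendation lists, and a simple STEM-rate proxy in place of the paper's STEM Disparity Score; all names and data are illustrative, not the author's.

```python
"""Illustrative sketch of the pairwise bias metrics named in the abstract.

The paper's code and prompt wording are not given on this page; the
template, toy data, and STEM-rate proxy below are assumptions.
"""
from scipy.stats import wasserstein_distance

# Hypothetical profile prompt in the spirit of the study's setup.
PROMPT_TEMPLATE = (
    "A {gender} {race} high school senior in California scored in the "
    "{percentile}th percentile on standardized tests. "
    "Recommend five college majors for this student."
)

STEM_MAJORS = {"Computer Science", "Mathematics", "Biology", "Engineering"}  # illustrative subset

def jaccard(a: set, b: set) -> float:
    """Jaccard coefficient |A ∩ B| / |A ∪ B| between two recommendation sets."""
    return len(a & b) / len(a | b) if (a | b) else 1.0

def stem_rate(recs: list) -> float:
    """Fraction of recommended majors that are STEM."""
    return sum(r in STEM_MAJORS for r in recs) / len(recs)

# Toy recommendations for a baseline profile and a minority-group profile.
baseline = ["Computer Science", "Mathematics", "Economics", "Biology"]
profile = ["Psychology", "Sociology", "Biology", "English"]

print("Jaccard overlap:", round(jaccard(set(baseline), set(profile)), 2))

# Wasserstein distance between toy percentile samples at which each group
# received a STEM recommendation (stand-ins for the paper's distributions).
print("Wasserstein:", wasserstein_distance([90, 75, 60, 85], [70, 55, 40, 65]))

# Simple proxy for a STEM disparity: relative drop in STEM recommendation rate.
print(f"STEM rate drop vs. baseline: {1 - stem_rate(profile) / stem_rate(baseline):.0%}")
```

A lower Jaccard overlap, a larger Wasserstein distance, or a larger STEM-rate drop between a demographic profile and the unspecified baseline would each indicate the kind of disparity the abstract reports.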
