Computer says ‘no’: Exploring systemic bias in ChatGPT using an audit approach

Louis Lippens
{"title":"Computer says ‘no’: Exploring systemic bias in ChatGPT using an audit approach","authors":"Louis Lippens","doi":"10.1016/j.chbah.2024.100054","DOIUrl":null,"url":null,"abstract":"<div><p>Large language models offer significant potential for increasing labour productivity, such as streamlining personnel selection, but raise concerns about perpetuating systemic biases embedded into their pre-training data. This study explores the potential ethnic and gender bias of ChatGPT—a chatbot producing human-like responses to language tasks—in assessing job applicants. Using the correspondence audit approach from the social sciences, I simulated a CV screening task with 34,560 vacancy–CV combinations where the chatbot had to rate fictitious applicant profiles. Comparing ChatGPT's ratings of Arab, Asian, Black American, Central African, Dutch, Eastern European, Hispanic, Turkish, and White American male and female applicants, I show that ethnic and gender identity influence the chatbot's evaluations. Ethnic discrimination is more pronounced than gender discrimination and mainly occurs in jobs with favourable labour conditions or requiring greater language proficiency. In contrast, gender bias emerges in gender-atypical roles. These findings suggest that ChatGPT's discriminatory output reflects a statistical mechanism echoing societal stereotypes. Policymakers and developers should address systemic bias in language model-driven applications to ensure equitable treatment across demographic groups. Practitioners should practice caution, given the adverse impact these tools can (re)produce, especially in selection decisions involving humans.</p></div>","PeriodicalId":100324,"journal":{"name":"Computers in Human Behavior: Artificial Humans","volume":"2 1","pages":"Article 100054"},"PeriodicalIF":0.0000,"publicationDate":"2024-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.sciencedirect.com/science/article/pii/S2949882124000148/pdfft?md5=1537d8a7b6f70ed502f954301b884704&pid=1-s2.0-S2949882124000148-main.pdf","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Computers in Human Behavior: Artificial Humans","FirstCategoryId":"1085","ListUrlMain":"https://www.sciencedirect.com/science/article/pii/S2949882124000148","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 0

Abstract

Large language models offer significant potential for increasing labour productivity, such as streamlining personnel selection, but raise concerns about perpetuating systemic biases embedded into their pre-training data. This study explores the potential ethnic and gender bias of ChatGPT—a chatbot producing human-like responses to language tasks—in assessing job applicants. Using the correspondence audit approach from the social sciences, I simulated a CV screening task with 34,560 vacancy–CV combinations where the chatbot had to rate fictitious applicant profiles. Comparing ChatGPT's ratings of Arab, Asian, Black American, Central African, Dutch, Eastern European, Hispanic, Turkish, and White American male and female applicants, I show that ethnic and gender identity influence the chatbot's evaluations. Ethnic discrimination is more pronounced than gender discrimination and mainly occurs in jobs with favourable labour conditions or requiring greater language proficiency. In contrast, gender bias emerges in gender-atypical roles. These findings suggest that ChatGPT's discriminatory output reflects a statistical mechanism echoing societal stereotypes. Policymakers and developers should address systemic bias in language model-driven applications to ensure equitable treatment across demographic groups. Practitioners should practice caution, given the adverse impact these tools can (re)produce, especially in selection decisions involving humans.

电脑说 "不":使用审计方法探索 ChatGPT 中的系统性偏见
大型语言模型为提高劳动生产率(如简化人员选拔)提供了巨大的潜力,但也引起了人们对其预训练数据中蕴含的系统性偏见长期存在的担忧。本研究探讨了聊天机器人 ChatGPT 在评估求职者时可能存在的种族和性别偏见。我使用社会科学中的对应审计方法,模拟了一个包含 34,560 个空缺职位和简历组合的简历筛选任务,聊天机器人必须对虚构的求职者资料进行评分。通过比较 ChatGPT 对阿拉伯人、亚洲人、美国黑人、中非人、荷兰人、东欧人、西班牙人、土耳其人和美国白人男性和女性求职者的评价,我发现种族和性别认同影响了聊天机器人的评价。种族歧视比性别歧视更明显,主要发生在劳动条件较好或需要较高语言能力的工作中。相比之下,性别偏见出现在性别典型的角色中。这些发现表明,ChatGPT 的歧视性输出反映了一种与社会刻板印象相呼应的统计机制。政策制定者和开发者应该解决语言模型驱动应用程序中的系统性偏见,以确保公平对待不同的人口群体。鉴于这些工具可能(重新)产生的不利影响,尤其是在涉及人类的选择决策中,从业人员应谨慎行事。
本文章由计算机程序翻译,如有差异,请以英文原文为准。
求助全文
约1分钟内获得全文 求助全文
来源期刊
自引率
0.00%
发文量
0
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
确定
请完成安全验证×
copy
已复制链接
快去分享给好友吧!
我知道了
右上角分享
点击右上角分享
0
联系我们:info@booksci.cn Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。 Copyright © 2023 布克学术 All rights reserved.
京ICP备2023020795号-1
ghs 京公网安备 11010802042870号
Book学术文献互助
Book学术文献互助群
群 号:481959085
Book学术官方微信