Antidiscrimination Laws, Artificial Intelligence, and Gender Bias: A Case Study in Nonmortgage Fintech Lending

Stephanie Kelley, Anton Ovchinnikov, D. Hardoon, Adrienne Heinrich
{"title":"Antidiscrimination Laws, Artificial Intelligence, and Gender Bias: A Case Study in Nonmortgage Fintech Lending","authors":"Stephanie Kelley, Anton Ovchinnikov, D. Hardoon, Adrienne Heinrich","doi":"10.1287/msom.2022.1108","DOIUrl":null,"url":null,"abstract":"Problem definition: We use a realistically large, publicly available data set from a global fintech lender to simulate the impact of different antidiscrimination laws and their corresponding data management and model-building regimes on gender-based discrimination in the nonmortgage fintech lending setting. Academic/practical relevance: Our paper extends the conceptual understanding of model-based discrimination from computer science to a realistic context that simulates the situations faced by fintech lenders in practice, where advanced machine learning (ML) techniques are used with high-dimensional, feature-rich, highly multicollinear data. We provide technically and legally permissible approaches for firms to reduce discrimination across different antidiscrimination regimes whilst managing profitability. Methodology: We train statistical and ML models on a large and realistically rich publicly available data set to simulate different antidiscrimination regimes and measure their impact on model quality and firm profitability. We use ML explainability techniques to understand the drivers of ML discrimination. Results: We find that regimes that prohibit the use of gender (like those in the United States) substantially increase discrimination and slightly decrease firm profitability. We observe that ML models are less discriminatory, of better predictive quality, and more profitable compared with traditional statistical models like logistic regression. Unlike omitted variable bias—which drives discrimination in statistical models—ML discrimination is driven by changes in the model training procedure, including feature engineering and feature selection, when gender is excluded. We observe that down sampling the training data to rebalance gender, gender-aware hyperparameter selection, and up sampling the training data to rebalance gender all reduce discrimination, with varying trade-offs in predictive quality and firm profitability. Probabilistic gender proxy modeling (imputing applicant gender) further reduces discrimination with negligible impact on predictive quality and a slight increase in firm profitability. Managerial implications: A rethink is required of the antidiscrimination laws, specifically with respect to the collection and use of protected attributes for ML models. Firms should be able to collect protected attributes to, at minimum, measure discrimination and ideally, take steps to reduce it. Increased data access should come with greater accountability for firms.","PeriodicalId":18108,"journal":{"name":"Manuf. Serv. Oper. Manag.","volume":"54 3 1","pages":"3039-3059"},"PeriodicalIF":0.0000,"publicationDate":"2022-05-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"4","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Manuf. Serv. Oper. Manag.","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1287/msom.2022.1108","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 4

Abstract

Problem definition: We use a realistically large, publicly available data set from a global fintech lender to simulate the impact of different antidiscrimination laws and their corresponding data management and model-building regimes on gender-based discrimination in the nonmortgage fintech lending setting.

Academic/practical relevance: Our paper extends the conceptual understanding of model-based discrimination from computer science to a realistic context that simulates the situations faced by fintech lenders in practice, where advanced machine learning (ML) techniques are used with high-dimensional, feature-rich, highly multicollinear data. We provide technically and legally permissible approaches for firms to reduce discrimination across different antidiscrimination regimes whilst managing profitability.

Methodology: We train statistical and ML models on a large and realistically rich publicly available data set to simulate different antidiscrimination regimes and measure their impact on model quality and firm profitability. We use ML explainability techniques to understand the drivers of ML discrimination.

Results: We find that regimes that prohibit the use of gender (like those in the United States) substantially increase discrimination and slightly decrease firm profitability. We observe that ML models are less discriminatory, of better predictive quality, and more profitable compared with traditional statistical models like logistic regression. Unlike omitted variable bias—which drives discrimination in statistical models—ML discrimination is driven by changes in the model training procedure, including feature engineering and feature selection, when gender is excluded. We observe that down sampling the training data to rebalance gender, gender-aware hyperparameter selection, and up sampling the training data to rebalance gender all reduce discrimination, with varying trade-offs in predictive quality and firm profitability. Probabilistic gender proxy modeling (imputing applicant gender) further reduces discrimination with negligible impact on predictive quality and a slight increase in firm profitability.

Managerial implications: A rethink is required of the antidiscrimination laws, specifically with respect to the collection and use of protected attributes for ML models. Firms should be able to collect protected attributes to, at minimum, measure discrimination and ideally, take steps to reduce it. Increased data access should come with greater accountability for firms.
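To make two of the abstract's ideas concrete, below is a minimal sketch (not the authors' code) of measuring gender-based discrimination as an approval-rate gap and of down sampling the training data to rebalance gender. The synthetic data, column names (`gender`, `log_income`, `log_loan`, `default`), the demographic-parity-style metric, and the 0.5 approval threshold are all assumptions made for illustration.

```python
# Illustrative sketch: approval-rate gap and gender rebalancing via
# down sampling. Synthetic data only; not the paper's data or code.
import numpy as np
import pandas as pd
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
n = 10_000

# Toy stand-in for a loan-application data set: gender is imbalanced, and
# income acts as a gender-correlated proxy, so even a gender-blind model
# can produce gender-skewed approvals.
gender = rng.choice(["F", "M"], size=n, p=[0.3, 0.7])
log_income = rng.normal(10.0 + 0.3 * (gender == "M"), 0.5)
log_loan = rng.normal(8.0, 0.7, size=n)
p_default = 1.0 / (1.0 + np.exp(3.0 * (log_income - 10.15)))
df = pd.DataFrame({
    "gender": gender,
    "log_income": log_income,
    "log_loan": log_loan,
    "default": (rng.random(n) < p_default).astype(int),
})

# Gender itself is excluded from the features, as under a US-style regime
# that prohibits its use.
features = ["log_income", "log_loan"]

def approval_rate_gap(model, X, genders, threshold=0.5):
    """Demographic-parity-style gap |P(approve|F) - P(approve|M)|, approving
    when the predicted default probability falls below `threshold`."""
    approve = model.predict_proba(X)[:, 1] < threshold
    rates = pd.Series(approve, index=X.index).groupby(genders).mean()
    return abs(rates["F"] - rates["M"])

# Baseline: train on the gender-imbalanced data.
base = LogisticRegression().fit(df[features], df["default"])

# Mitigation: down sample the majority gender so both groups are equally
# represented in training. Note that this requires access to the protected
# attribute, which is the paper's central policy point.
n_min = df["gender"].value_counts().min()
balanced = df.groupby("gender").sample(n=n_min, random_state=0)
rebal = LogisticRegression().fit(balanced[features], balanced["default"])

print("gap before rebalancing:", approval_rate_gap(base, df[features], df["gender"]))
print("gap after rebalancing: ", approval_rate_gap(rebal, df[features], df["gender"]))
```

In this toy setup the rebalancing step may move the gap only slightly, because the proxy relationship between income and gender is unchanged by resampling; the paper evaluates this and stronger mitigations (gender-aware hyperparameter selection, up sampling, probabilistic gender proxy modeling) on realistic data, where the trade-offs with predictive quality and profitability differ.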