Improving Group Fairness Assessments with Proxies

Emma Harvey, M. S. Lee, Jatinder Singh
{"title":"Improving Group Fairness Assessments with Proxies","authors":"Emma Harvey, M. S. Lee, Jatinder Singh","doi":"10.1145/3677175","DOIUrl":null,"url":null,"abstract":"\n Although algorithms are increasingly used to guide real-world decision-making, their potential for propagating bias remains challenging to measure. A common approach for researchers and practitioners examining algorithms for unintended discriminatory biases is to assess group fairness, which compares outcomes across typically sensitive or protected demographic features like race, gender, or age. In practice, however, data representing these group attributes is often not collected, or may be unavailable due to policy, legal, or other constraints. As a result, practitioners often find themselves tasked with assessing fairness in the face of these missing features. In such cases, they can either forgo a bias audit, obtain the missing data directly, or impute it. Because obtaining additional data is often prohibitively expensive or raises privacy concerns, many practitioners attempt to impute missing data using proxies. Through a survey of the data used in algorithmic fairness literature, which we make public to facilitate future research, we show that when available at all, most publicly available proxy sources are in the form of\n summary tables\n , which contain only aggregate statistics about a population. Prior work has found that these proxies are not predictive enough on their own to accurately measure group fairness. Even proxy variables that are correlated with group attributes also contain noise (i.e. will predict attributes for a subset of the population effectively at random).\n \n \n Here, we outline a method for improving accuracy in measuring group fairness using summary tables. Specifically, we propose improving accuracy by focusing only on\n highly predictive values\n within proxy variables, and outline the conditions under which these proxies can estimate fairness disparities with high accuracy. We then show that a major disqualifying criterion—an association between the proxy and the outcome—can be controlled for using causal inference. Finally, we show that when proxy data is missing altogether, our approach is applicable to rule-based proxies constructed using subject-matter context applied to the original data alone. Crucially, we are able to extract information on group disparities from proxies that may have low discriminatory power at the population level. We illustrate our results through a variety of case studies with real and simulated data. In all, we present a viable method allowing the assessment of fairness in the face of missing data, with limited privacy implications and without needing to rely on complex, expensive, or proprietary data sources.\n","PeriodicalId":486991,"journal":{"name":"ACM Journal on Responsible Computing","volume":"6 11","pages":""},"PeriodicalIF":0.0000,"publicationDate":"2024-07-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"ACM Journal on Responsible Computing","FirstCategoryId":"0","ListUrlMain":"https://doi.org/10.1145/3677175","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}

Abstract

Although algorithms are increasingly used to guide real-world decision-making, their potential for propagating bias remains challenging to measure. A common approach for researchers and practitioners examining algorithms for unintended discriminatory biases is to assess group fairness, which compares outcomes across typically sensitive or protected demographic features like race, gender, or age. In practice, however, data representing these group attributes is often not collected, or may be unavailable due to policy, legal, or other constraints. As a result, practitioners often find themselves tasked with assessing fairness in the face of these missing features. In such cases, they can either forgo a bias audit, obtain the missing data directly, or impute it. Because obtaining additional data is often prohibitively expensive or raises privacy concerns, many practitioners attempt to impute missing data using proxies. Through a survey of the data used in algorithmic fairness literature, which we make public to facilitate future research, we show that when available at all, most publicly available proxy sources are in the form of summary tables, which contain only aggregate statistics about a population. Prior work has found that these proxies are not predictive enough on their own to accurately measure group fairness. Even proxy variables that are correlated with group attributes also contain noise (i.e., they will predict attributes for a subset of the population effectively at random). Here, we outline a method for improving accuracy in measuring group fairness using summary tables. Specifically, we propose improving accuracy by focusing only on highly predictive values within proxy variables, and outline the conditions under which these proxies can estimate fairness disparities with high accuracy. We then show that a major disqualifying criterion—an association between the proxy and the outcome—can be controlled for using causal inference. Finally, we show that when proxy data is missing altogether, our approach is applicable to rule-based proxies constructed using subject-matter context applied to the original data alone. Crucially, we are able to extract information on group disparities from proxies that may have low discriminatory power at the population level. We illustrate our results through a variety of case studies with real and simulated data. In all, we present a viable method allowing the assessment of fairness in the face of missing data, with limited privacy implications and without needing to rely on complex, expensive, or proprietary data sources.
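
To make the high-level idea concrete, below is a minimal sketch in Python with pandas of how a disparity estimate might be restricted to highly predictive proxy values drawn from a summary table. It is an illustration under stated assumptions, not the authors' implementation: the proxy column ("zip"), outcome column ("outcome"), summary-table column ("p_group"), and the 0.9 cut-off are all hypothetical.

# Illustrative sketch (not the published method): estimate a demographic-parity-style
# gap using a summary-table proxy, keeping only records whose proxy value is
# highly predictive of group membership.
import pandas as pd

def disparity_via_proxy(records: pd.DataFrame,
                        summary: pd.DataFrame,
                        proxy_col: str = "zip",        # hypothetical proxy variable
                        outcome_col: str = "outcome",  # hypothetical binary favorable outcome
                        threshold: float = 0.9):       # hypothetical "highly predictive" cut-off
    """Estimate P(outcome | likely group members) - P(outcome | likely non-members),
    using only proxy values whose aggregate group share clears the threshold."""
    # `summary` is assumed to be indexed by proxy value, with a column "p_group"
    # holding the aggregate share of the group of interest for that value
    # (the kind of statistic a public summary table provides).
    merged = records.merge(summary, left_on=proxy_col, right_index=True, how="inner")

    likely_members = merged[merged["p_group"] >= threshold]
    likely_nonmembers = merged[merged["p_group"] <= 1 - threshold]

    if likely_members.empty or likely_nonmembers.empty:
        return None  # proxy has no highly predictive values at this threshold

    return likely_members[outcome_col].mean() - likely_nonmembers[outcome_col].mean()

Whether such a restricted estimate generalizes to the full population depends on the conditions the paper outlines, including controlling for any association between the proxy and the outcome.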