The meaning and measurement of bias: lessons from natural language processing

Proceedings of the 2020 Conference on Fairness, Accountability, and Transparency | Pub Date: 2020-01-22 | DOI: 10.1145/3351095.3375671

Abigail Z. Jacobs, Su Lin Blodgett, Solon Barocas, Hal Daumé, Hanna M. Wallach

Citations: 19

Abstract

The recent interest in identifying and mitigating bias in computational systems has introduced a wide range of different, and occasionally incomparable, proposals for what constitutes bias in such systems. This tutorial introduces the language of measurement modeling from the quantitative social sciences as a framework for examining how social, organizational, and political values enter computational systems and unpacking the varied normative concerns operationalized in different techniques for measuring "bias." We show that this framework helps to clarify the way unobservable theoretical constructs, such as "creditworthiness," "risk to society," or "tweet toxicity," are turned into measurable quantities and how this process may introduce fairness-related harms. In particular, we demonstrate how to systematically assess the construct validity and reliability of these measurements to detect and characterize specific types of harms, which arise from mismatches between constructs and their operationalizations. We then take a critical look at existing approaches to examining "bias" in NLP models, ranging from work on embedding spaces to machine translation and hate speech detection. We show that measurement modeling can help uncover the implicit constructs that such work aims to capture when measuring "bias." In so doing, we illustrate the limits of current "debiasing" techniques, which have obscured the specific harms whose measurements they implicitly aim to reduce. By introducing the language of measurement modeling, we provide the FAT* community with a framework for making explicit and testing assumptions about unobservable theoretical constructs embedded in computational systems, thereby clarifying and uniting our understandings of fairness-related harms.
