Abigail Z. Jacobs, Su Lin Blodgett, Solon Barocas, Hal Daumé, Hanna M. Wallach
Title: The meaning and measurement of bias: lessons from natural language processing
Venue: Proceedings of the 2020 Conference on Fairness, Accountability, and Transparency
Published: 2020-01-22
DOI: 10.1145/3351095.3375671 (https://doi.org/10.1145/3351095.3375671)
Citations: 19
Abstract
The recent interest in identifying and mitigating bias in computational systems has introduced a wide range of different (and occasionally incomparable) proposals for what constitutes bias in such systems. This tutorial introduces the language of measurement modeling from the quantitative social sciences as a framework for examining how social, organizational, and political values enter computational systems and unpacking the varied normative concerns operationalized in different techniques for measuring "bias." We show that this framework helps to clarify the way unobservable theoretical constructs (such as "creditworthiness," "risk to society," or "tweet toxicity") are turned into measurable quantities and how this process may introduce fairness-related harms. In particular, we demonstrate how to systematically assess the construct validity and reliability of these measurements to detect and characterize specific types of harms, which arise from mismatches between constructs and their operationalizations. We then take a critical look at existing approaches to examining "bias" in NLP models, ranging from work on embedding spaces to machine translation and hate speech detection. We show that measurement modeling can help uncover the implicit constructs that such work aims to capture when measuring "bias." In so doing, we illustrate the limits of current "debiasing" techniques, which have obscured the specific harms whose measurements they implicitly aim to reduce. By introducing the language of measurement modeling, we provide the FAT* community with a framework for making explicit and testing assumptions about unobservable theoretical constructs embedded in computational systems, thereby clarifying and uniting our understandings of fairness-related harms.