Undesirable Biases in NLP: Addressing Challenges of Measurement

IF 5.4; Zone 3, Materials Science; JCR Q2, CHEMISTRY, PHYSICAL

ACS Applied Energy Materials. Pub Date: 2024-01-10. DOI: 10.1613/jair.1.15195

Oskar van der Wal, Dominik Bachmann, Alina Leidinger, Leendert van Maanen, Willem Zuidema, Katrin Schulz



Abstract

As Large Language Models and Natural Language Processing (NLP) technology rapidly develop and spread into daily life, it becomes crucial to anticipate how their use could harm people. One problem that has received a lot of attention in recent years is that this technology has displayed harmful biases, from generating derogatory stereotypes to producing disparate outcomes for different social groups. Although a lot of effort has been invested in assessing and mitigating these biases, our methods of measuring the biases of NLP models have serious problems and it is often unclear what they actually measure. In this paper, we provide an interdisciplinary approach to discussing the issue of NLP model bias by adopting the lens of psychometrics — a field specialized in the measurement of concepts like bias that are not directly observable. In particular, we will explore two central notions from psychometrics, the construct validity and the reliability of measurement tools, and discuss how they can be applied in the context of measuring model bias. Our goal is to provide NLP practitioners with methodological tools for designing better bias measures, and to inspire them more generally to explore tools from psychometrics when working on bias measurement tools. This article appears in the AI & Society track.


Source journal

ACS Applied Energy Materials (Materials Science: Materials Chemistry)

CiteScore: 10.30

Self-citation rate: 6.20%

Articles published: 1368

Journal description: ACS Applied Energy Materials is an interdisciplinary journal publishing original research covering all aspects of materials, engineering, chemistry, physics and biology relevant to energy conversion and storage. The journal is devoted to reports of new and original experimental and theoretical research of an applied nature that integrate knowledge in the areas of materials, engineering, physics, bioscience, and chemistry into important energy applications.