Mitigating social biases of pre-trained language models via contrastive self-debiasing with double data augmentation

IF 5.1 · CAS Tier 2 (Computer Science) · JCR Q1, COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE
Yingji Li, Mengnan Du, Rui Song, Xin Wang, Mingchen Sun, Ying Wang
Artificial Intelligence · Published 2024-04-26 · DOI: 10.1016/j.artint.2024.104143
Citations: 0

Abstract

Pre-trained Language Models (PLMs) have been shown to inherit and even amplify the social biases contained in their training corpora, leading to undesired stereotypes in real-world applications. Existing techniques for mitigating the social biases of PLMs mainly rely on data augmentation with manually designed prior knowledge or on fine-tuning with abundant external corpora. However, these methods are not only limited by manual experience but also consume substantial resources to access all the parameters of the PLMs, and they are prone to introducing new external biases when fine-tuning with external corpora. In this paper, we propose a Contrastive Self-Debiasing Model with Double Data Augmentation (named CD3) for mitigating the social biases of PLMs. Specifically, CD3 consists of two stages: double data augmentation and contrastive self-debiasing. First, we build on counterfactual data augmentation to perform a secondary augmentation using biased prompts that are automatically searched by maximizing the differences in the PLM's encodings across demographic groups. Double data augmentation further amplifies the biases between sample pairs, breaking the limitation of previous debiasing models that rely heavily on prior knowledge in data augmentation. We then leverage the augmented data for contrastive learning to train a plug-and-play adapter that mitigates the social biases in the PLM's encodings without tuning the PLM itself. Extensive experimental results on BERT, ALBERT, and RoBERTa across several real-world datasets and fairness metrics show that CD3 outperforms baseline models on gender and race debiasing while retaining the language modeling capabilities of the PLMs.
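The two stages described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' code: the demographic word list, the toy hash-based encoder, and all function names are hypothetical stand-ins (a real implementation would encode with a frozen PLM such as BERT and train an adapter against the contrastive loss).

```python
import math

# Hypothetical demographic word pairs; the paper's word lists would be larger.
SWAP = {"he": "she", "she": "he", "man": "woman", "woman": "man",
        "his": "hers", "hers": "his"}

def counterfactual(sentence):
    """First augmentation: swap each demographic term for its counterpart."""
    return " ".join(SWAP.get(w, w) for w in sentence.lower().split())

def toy_encode(text, dim=8):
    """Toy stand-in encoder hashing words into a small bag-of-words vector.
    A frozen PLM encoder would be used here in practice."""
    v = [0.0] * dim
    for w in text.split():
        v[hash(w) % dim] += 1.0
    return v

def biased_prompt(encode, sentence, candidate_prompts):
    """Second augmentation: among candidate prompts, pick the one that
    maximizes the encoding distance between a sentence and its
    counterfactual, i.e. the prompt that most amplifies group-dependent
    differences in the encoder's output."""
    pair = counterfactual(sentence)
    def dist(p):
        return math.dist(encode(p + " " + sentence), encode(p + " " + pair))
    return max(candidate_prompts, key=dist)

def contrastive_loss(z1, z2, negatives, tau=0.1):
    """InfoNCE-style loss pulling a sentence and its counterfactual
    together while pushing unrelated encodings apart; an adapter over the
    frozen PLM would be trained to minimize a loss of this shape."""
    def cos(a, b):
        num = sum(x * y for x, y in zip(a, b))
        den = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
        return num / den if den else 0.0
    pos = math.exp(cos(z1, z2) / tau)
    neg = sum(math.exp(cos(z1, n) / tau) for n in negatives)
    return -math.log(pos / (pos + neg))

print(counterfactual("The man said he was tired"))
# -> the woman said she was tired
```

The sketch highlights the design choice in the abstract: bias is amplified at data level (counterfactual swap plus an automatically searched biased prompt) so that the contrastive objective has strong signal, while the PLM's own parameters are never updated.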

Source journal: Artificial Intelligence (Engineering & Technology — Computer Science: Artificial Intelligence)
CiteScore: 11.20
Self-citation rate: 1.40%
Articles per year: 118
Review time: 8 months
Journal description: The Journal of Artificial Intelligence (AIJ) welcomes papers covering a broad spectrum of AI topics, including cognition, automated reasoning, computer vision, machine learning, and more. Papers should demonstrate advancements in AI and propose innovative approaches to AI problems. Additionally, the journal accepts papers describing AI applications, focusing on how new methods enhance performance rather than reiterating conventional approaches. In addition to regular papers, AIJ also accepts Research Notes, Research Field Reviews, Position Papers, Book Reviews, and summary papers on AI challenges and competitions.