BATED: Learning fair representation for Pre-trained Language Models via biased teacher-guided disentanglement

IF 4.6 · CAS Zone 2, Computer Science · JCR Q1, COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE
Yingji Li, Mengnan Du, Rui Song, Mu Liu, Ying Wang
{"title":"BATED:通过有偏见的教师引导的解纠缠来学习预训练语言模型的公平表示","authors":"Yingji Li ,&nbsp;Mengnan Du ,&nbsp;Rui Song ,&nbsp;Mu Liu ,&nbsp;Ying Wang","doi":"10.1016/j.artint.2025.104401","DOIUrl":null,"url":null,"abstract":"<div><div>With the rapid development of Pre-trained Language Models (PLMs) and their widespread deployment in various real-world applications, social biases of PLMs have attracted increasing attention, especially the fairness of downstream tasks, which potentially affects the development and stability of society. Among existing debiasing methods, intrinsic debiasing methods are not necessarily effective when applied to downstream tasks, and the downstream fine-tuning process may introduce new biases or catastrophic forgetting. Most extrinsic debiasing methods rely on sensitive attribute words as prior knowledge to supervise debiasing training. However, it is difficult to collect sensitive attribute information of real data due to privacy and regulation. Moreover, limited sensitive attribute words may lead to inadequate debiasing training. To this end, this paper proposes a debiasing method to learn fair representation for PLMs via <strong>B</strong>i<strong>A</strong>sed <strong>TE</strong>acher-guided <strong>D</strong>isentanglement (called <strong>BATED</strong>). Specific to downstream tasks, BATED performs debiasing training under the guidance of a biased teacher model rather than relying on sensitive attribute information of the training data. First, we leverage causal contrastive learning to train a task-agnostic general biased teacher model. We then employ Variational Auto-Encoder (VAE) to disentangle the PLM-encoded representation into the fair representation and the biased representation. The Biased representation is further decoupled via biased teacher-guided disentanglement, while the fair representation learn downstream tasks. Therefore, BATED guarantees the performance of downstream tasks while improving the fairness. Experimental results on seven PLMs testing three downstream tasks demonstrate that BATED outperforms the state-of-the-art overall in terms of fairness and performance on downstream tasks.</div></div>","PeriodicalId":8434,"journal":{"name":"Artificial Intelligence","volume":"348 ","pages":"Article 104401"},"PeriodicalIF":4.6000,"publicationDate":"2025-08-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"BATED: Learning fair representation for Pre-trained Language Models via biased teacher-guided disentanglement\",\"authors\":\"Yingji Li ,&nbsp;Mengnan Du ,&nbsp;Rui Song ,&nbsp;Mu Liu ,&nbsp;Ying Wang\",\"doi\":\"10.1016/j.artint.2025.104401\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"<div><div>With the rapid development of Pre-trained Language Models (PLMs) and their widespread deployment in various real-world applications, social biases of PLMs have attracted increasing attention, especially the fairness of downstream tasks, which potentially affects the development and stability of society. Among existing debiasing methods, intrinsic debiasing methods are not necessarily effective when applied to downstream tasks, and the downstream fine-tuning process may introduce new biases or catastrophic forgetting. Most extrinsic debiasing methods rely on sensitive attribute words as prior knowledge to supervise debiasing training. However, it is difficult to collect sensitive attribute information of real data due to privacy and regulation. 
Moreover, limited sensitive attribute words may lead to inadequate debiasing training. To this end, this paper proposes a debiasing method to learn fair representation for PLMs via <strong>B</strong>i<strong>A</strong>sed <strong>TE</strong>acher-guided <strong>D</strong>isentanglement (called <strong>BATED</strong>). Specific to downstream tasks, BATED performs debiasing training under the guidance of a biased teacher model rather than relying on sensitive attribute information of the training data. First, we leverage causal contrastive learning to train a task-agnostic general biased teacher model. We then employ Variational Auto-Encoder (VAE) to disentangle the PLM-encoded representation into the fair representation and the biased representation. The Biased representation is further decoupled via biased teacher-guided disentanglement, while the fair representation learn downstream tasks. Therefore, BATED guarantees the performance of downstream tasks while improving the fairness. Experimental results on seven PLMs testing three downstream tasks demonstrate that BATED outperforms the state-of-the-art overall in terms of fairness and performance on downstream tasks.</div></div>\",\"PeriodicalId\":8434,\"journal\":{\"name\":\"Artificial Intelligence\",\"volume\":\"348 \",\"pages\":\"Article 104401\"},\"PeriodicalIF\":4.6000,\"publicationDate\":\"2025-08-08\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Artificial Intelligence\",\"FirstCategoryId\":\"94\",\"ListUrlMain\":\"https://www.sciencedirect.com/science/article/pii/S0004370225001201\",\"RegionNum\":2,\"RegionCategory\":\"计算机科学\",\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"Q1\",\"JCRName\":\"COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Artificial Intelligence","FirstCategoryId":"94","ListUrlMain":"https://www.sciencedirect.com/science/article/pii/S0004370225001201","RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE","Score":null,"Total":0}
Cited by: 0

Abstract

With the rapid development of Pre-trained Language Models (PLMs) and their widespread deployment in real-world applications, the social biases of PLMs have attracted increasing attention, especially the fairness of downstream tasks, which potentially affects the development and stability of society. Among existing debiasing methods, intrinsic debiasing methods are not necessarily effective when applied to downstream tasks, and the downstream fine-tuning process may introduce new biases or catastrophic forgetting. Most extrinsic debiasing methods rely on sensitive attribute words as prior knowledge to supervise debiasing training. However, it is difficult to collect sensitive attribute information for real data due to privacy and regulatory constraints. Moreover, a limited set of sensitive attribute words may lead to inadequate debiasing training. To this end, this paper proposes a debiasing method that learns fair representations for PLMs via BiAsed TEacher-guided Disentanglement (BATED). For downstream tasks, BATED performs debiasing training under the guidance of a biased teacher model rather than relying on sensitive attribute information in the training data. First, we leverage causal contrastive learning to train a task-agnostic, general biased teacher model. We then employ a Variational Auto-Encoder (VAE) to disentangle the PLM-encoded representation into a fair representation and a biased representation. The biased representation is further decoupled via biased teacher-guided disentanglement, while the fair representation learns the downstream task. BATED thus preserves downstream task performance while improving fairness. Experimental results on seven PLMs across three downstream tasks demonstrate that BATED outperforms state-of-the-art methods overall in terms of both fairness and downstream-task performance.
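To make the disentanglement step concrete, below is a minimal, hypothetical PyTorch sketch of the design described in the abstract: a VAE-style module splits the PLM-encoded representation into a fair latent code and a biased latent code, the biased code is pulled toward a frozen biased teacher's representation, and only the fair code feeds the downstream classifier. All module names, dimensions, and loss weights here are illustrative assumptions, not the authors' implementation.

# Minimal sketch (assumptions, not the authors' code) of VAE-based
# disentanglement with biased-teacher guidance: the PLM-encoded vector h
# is split into a fair latent code and a biased latent code; the biased
# code is regressed onto a frozen biased teacher's representation, and
# only the fair code is used for the downstream task.
import torch
import torch.nn as nn
import torch.nn.functional as F

class DisentangleVAE(nn.Module):
    def __init__(self, enc_dim=768, latent_dim=128, num_labels=2):
        super().__init__()
        # One Gaussian posterior head per latent factor (fair / biased).
        self.fair_mu = nn.Linear(enc_dim, latent_dim)
        self.fair_logvar = nn.Linear(enc_dim, latent_dim)
        self.bias_mu = nn.Linear(enc_dim, latent_dim)
        self.bias_logvar = nn.Linear(enc_dim, latent_dim)
        # Decoder reconstructs the PLM representation from both codes,
        # so together they must retain the input's information.
        self.decoder = nn.Linear(2 * latent_dim, enc_dim)
        # The downstream task head sees only the fair code.
        self.classifier = nn.Linear(latent_dim, num_labels)

    @staticmethod
    def reparameterize(mu, logvar):
        # Standard VAE reparameterization trick.
        return mu + torch.exp(0.5 * logvar) * torch.randn_like(mu)

    def forward(self, h):  # h: (batch, enc_dim), e.g. a PLM [CLS] vector
        f_mu, f_lv = self.fair_mu(h), self.fair_logvar(h)
        b_mu, b_lv = self.bias_mu(h), self.bias_logvar(h)
        z_fair = self.reparameterize(f_mu, f_lv)
        z_bias = self.reparameterize(b_mu, b_lv)
        recon = self.decoder(torch.cat([z_fair, z_bias], dim=-1))
        logits = self.classifier(z_fair)
        return recon, logits, (f_mu, f_lv, b_mu, b_lv), z_bias

def kl_to_standard_normal(mu, logvar):
    # KL divergence from N(mu, sigma^2) to the N(0, I) prior.
    return -0.5 * torch.mean(1 + logvar - mu.pow(2) - logvar.exp())

def training_loss(model, teacher_proj, h, teacher_h, labels):
    # teacher_h: the same input encoded by the frozen biased teacher;
    # teacher_proj (hypothetical) maps it into the biased latent space.
    recon, logits, (f_mu, f_lv, b_mu, b_lv), z_bias = model(h)
    loss_task = F.cross_entropy(logits, labels)        # fair code learns the task
    loss_recon = F.mse_loss(recon, h)                  # VAE reconstruction term
    loss_kl = kl_to_standard_normal(f_mu, f_lv) + kl_to_standard_normal(b_mu, b_lv)
    loss_guide = F.mse_loss(z_bias, teacher_proj(teacher_h))  # teacher guidance
    # Loss weights are illustrative; the paper's objective may differ.
    return loss_task + loss_recon + 0.1 * loss_kl + loss_guide

Under this sketch, training iterates over mini-batches of PLM outputs; at inference, only the fair code (or its posterior mean f_mu) and the classifier are needed, so the biased code can simply be discarded.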
Source journal: Artificial Intelligence (Engineering & Technology – Computer Science: Artificial Intelligence)
CiteScore: 11.20
Self-citation rate: 1.40%
Articles per year: 118
Review time: 8 months
About the journal: The Journal of Artificial Intelligence (AIJ) welcomes papers covering a broad spectrum of AI topics, including cognition, automated reasoning, computer vision, machine learning, and more. Papers should demonstrate advancements in AI and propose innovative approaches to AI problems. Additionally, the journal accepts papers describing AI applications, focusing on how new methods enhance performance rather than reiterating conventional approaches. In addition to regular papers, AIJ also accepts Research Notes, Research Field Reviews, Position Papers, Book Reviews, and summary papers on AI challenges and competitions.