Towards fair decision: A novel representation method for debiasing pre-trained models
Junheng He, Nankai Lin, Qifeng Bai, Haoyu Liang, Dong Zhou, Aimin Yang
Decision Support Systems, volume 181, Article 114208. Journal Article. Published 2024-03-06.
DOI: 10.1016/j.dss.2024.114208
URL: https://www.sciencedirect.com/science/article/pii/S0167923624000411
Impact factor: 6.7 (JCR Q1, Computer Science, Artificial Intelligence)
Citations: 0
Abstract
Pretrained language models (PLMs) are frequently employed in Decision Support Systems (DSSs) due to their strong performance. However, recent studies have revealed that these PLMs can exhibit social biases, leading to unfair decisions that harm vulnerable groups. Sensitive information contained in sentences from training data is the primary source of bias. Previously proposed debiasing methods based on contrastive disentanglement have proven highly effective. In these methods, PLMs disentangle sensitive information from non-sensitive information in sentence embeddings and then adapt only the non-sensitive information for downstream tasks. Such approaches hinge on having good sentence embeddings as input. However, recent research has found that most non-fine-tuned PLMs, such as BERT, produce poor sentence embeddings. Disentangling based on these embeddings leads to unsatisfactory debiasing results. Taking a finer-grained perspective, we propose PCFR (Prompt and Contrastive-based Fair Representation), a novel disentanglement method that integrates prompt learning and contrastive learning to debias PLMs. We employ prompt learning to represent information as sensitive embeddings and subsequently apply contrastive learning to contrast these information embeddings rather than the sentence embeddings. PCFR encourages similarity among different non-sensitive information embeddings and dissimilarity between sensitive and non-sensitive information embeddings. We mitigate gender and religion biases in two prominent PLMs, namely BERT and GPT-2. To comprehensively assess the debiasing efficacy of PCFR, we employ multiple fairness metrics. Experimental results consistently demonstrate the superior performance of PCFR compared to representative baseline methods. Additionally, when applied to specific downstream decision tasks, PCFR not only shows strong debiasing capability but also largely preserves task performance.
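The contrastive idea described in the abstract (pull non-sensitive information embeddings together, push sensitive ones away) can be illustrated with a generic InfoNCE-style loss. This is only a minimal sketch under stated assumptions: the abstract does not give PCFR's actual loss, so the function names, the temperature value, and the toy vectors below are all hypothetical and are not taken from the paper.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def contrastive_loss(anchor, positive, negatives, temperature=0.1):
    """Generic InfoNCE-style loss for a single anchor (assumed form, not PCFR's).

    anchor, positive: non-sensitive information embeddings (pulled together).
    negatives: sensitive information embeddings (pushed away from the anchor).
    """
    pos = math.exp(cosine(anchor, positive) / temperature)
    neg = sum(math.exp(cosine(anchor, n) / temperature) for n in negatives)
    return -math.log(pos / (pos + neg))


# Toy, hand-made 2-D vectors (hypothetical; not produced by BERT or GPT-2).
non_sensitive_a = [1.0, 0.0]
non_sensitive_b = [0.9, 0.1]

# Sensitive embedding orthogonal to the anchor: already well separated.
well_separated = contrastive_loss(non_sensitive_a, non_sensitive_b, [[0.0, 1.0]])

# Sensitive embedding almost parallel to the anchor: still entangled.
entangled = contrastive_loss(non_sensitive_a, non_sensitive_b, [[1.0, 0.05]])

print(well_separated < entangled)
```

In this toy setting the loss is smaller when the sensitive embedding points away from the non-sensitive ones, which is the direction a contrastive objective of this kind pushes the representations during training.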
About the journal:
The common thread of articles published in Decision Support Systems is their relevance to theoretical and technical issues in the support of enhanced decision making. The areas addressed may include foundations, functionality, interfaces, implementation, impacts, and evaluation of decision support systems (DSSs).