Proceedings of the 4th Workshop on Gender Bias in Natural Language Processing (GeBNLP): Latest Publications

Indigenous Language Revitalization and the Dilemma of Gender Bias

Proceedings of the 4th Workshop on Gender Bias in Natural Language Processing (GeBNLP) Pub Date : 1900-01-01 DOI: 10.18653/v1/2022.gebnlp-1.25

Oussama Hansal, N. Le, F. Sadat

Abstract: Natural Language Processing (NLP), through its many applications, is considered one of the most valuable fields in interdisciplinary research as well as in computer science. However, it is not without flaws, and one of the most common is bias. This paper examines the main linguistic challenges of Inuktitut, an Indigenous language of Canada, and focuses on identifying and mitigating gender bias. We explore the unique characteristics of this language to understand which techniques can be used to identify and mitigate implicit biases. We apply several methods to quantify the gender bias present in Inuktitut word embeddings, then mitigate that bias and evaluate the performance of the debiased embeddings. Next, we explain how approaches for detecting and reducing bias in English embeddings may be transferred to Inuktitut embeddings by properly taking into account the language’s particular characteristics, and we compare the effect of the debiasing techniques on Inuktitut and English. Finally, we highlight some directions for future research.

Citations: 3

HeteroCorpus: A Corpus for Heteronormative Language Detection

Proceedings of the 4th Workshop on Gender Bias in Natural Language Processing (GeBNLP) Pub Date : 1900-01-01 DOI: 10.18653/v1/2022.gebnlp-1.23

Juan Vásquez, G. Bel-Enguix, Scott Andersen, Sergio-Luis Ojeda-Trueba

Citations: 2

A Taxonomy of Bias-Causing Ambiguities in Machine Translation

Proceedings of the 4th Workshop on Gender Bias in Natural Language Processing (GeBNLP) Pub Date : 1900-01-01 DOI: 10.18653/v1/2022.gebnlp-1.18

M. Mechura

Citations: 4

Uncertainty and Inclusivity in Gender Bias Annotation: An Annotation Taxonomy and Annotated Datasets of British English Text

Proceedings of the 4th Workshop on Gender Bias in Natural Language Processing (GeBNLP) Pub Date : 1900-01-01 DOI: 10.18653/v1/2022.gebnlp-1.4

Lucy Havens, B. Alex, Benjamin Bach, Melissa Mhairi Terras

Abstract: Mitigating harms from gender-biased language in Natural Language Processing (NLP) systems remains a challenge, and the situated nature of language means bias is inescapable in NLP data. Though efforts to mitigate gender bias in NLP are numerous, they often define gender and bias vaguely, consider only two genders, and do not incorporate uncertainty into models. To address these limitations, we present a taxonomy of gender-biased language and apply it to create annotated datasets. We created the taxonomy and annotated data with the aim of making gender bias in language transparent. If biases are communicated clearly, varieties of biased language can be better identified and measured. Our taxonomy contains eleven types of gender bias, inclusive of people whose gender expressions do not fit into the binary conceptions of woman and man and whose gender differs from the one they were assigned at birth, while also allowing annotators to document unknown gender information. The taxonomy and annotated data will, in future work, underpin analysis and more equitable language model development.

Citations: 2

Why Knowledge Distillation Amplifies Gender Bias and How to Mitigate from the Perspective of DistilBERT

Proceedings of the 4th Workshop on Gender Bias in Natural Language Processing (GeBNLP) Pub Date : 1900-01-01 DOI: 10.18653/v1/2022.gebnlp-1.27

Jaimeen Ahn, Hwaran Lee, Jinhwa Kim, Alice Oh

Citations: 9

Incorporating Subjectivity into Gendered Ambiguous Pronoun (GAP) Resolution using Style Transfer

Proceedings of the 4th Workshop on Gender Bias in Natural Language Processing (GeBNLP) Pub Date : 1900-01-01 DOI: 10.18653/v1/2022.gebnlp-1.28

Kartikey Pant, Tanvi Dadu

Citations: 2

Analysis of Gender Bias in Social Perception and Judgement Using Chinese Word Embeddings

Proceedings of the 4th Workshop on Gender Bias in Natural Language Processing (GeBNLP) Pub Date : 1900-01-01 DOI: 10.18653/v1/2022.gebnlp-1.2

Jiali Li, Shucheng Zhu, Ying Liu, Pengyuan Liu

Citations: 2

Evaluating Gender Bias Transfer from Film Data

Proceedings of the 4th Workshop on Gender Bias in Natural Language Processing (GeBNLP) Pub Date : 1900-01-01 DOI: 10.18653/v1/2022.gebnlp-1.24

Amanda Bertsch, Ashley Oh, Sanika Natu, Swetha Gangu, A. Black, Emma Strubell

Abstract: Films are a rich source of data for natural language processing. OpenSubtitles (Lison and Tiedemann, 2016) is a popular movie script dataset, used for training models for tasks such as machine translation and dialogue generation. However, movies often contain biases that reflect society at the time, and these biases may be introduced during pre-training and influence downstream models. We perform sentiment analysis on template infilling (Kurita et al., 2019) and the Sentence Embedding Association Test (May et al., 2019) to measure how BERT-based language models change after continued pre-training on OpenSubtitles. We consider gender bias as a primary motivating case for this analysis, while also measuring other social biases such as disability. We show that sentiment analysis on template infilling is not an effective measure of bias due to the rarity of disability- and gender-identifying tokens in the movie dialogue. We extend our analysis to a longitudinal study of bias in film dialogue over the last 110 years and find that continued pre-training on OpenSubtitles encodes additional bias into BERT. We show that BERT learns associations that reflect the biases and representation of each film era, suggesting that additional care must be taken when using historical data.

Citations: 2

On Gender Biases in Offensive Language Classification Models

Proceedings of the 4th Workshop on Gender Bias in Natural Language Processing (GeBNLP) Pub Date : 1900-01-01 DOI: 10.18653/v1/2022.gebnlp-1.19

Sanjana Marcé, Adam Poliak

Citations: 1

An Empirical Study on the Fairness of Pre-trained Word Embeddings

Proceedings of the 4th Workshop on Gender Bias in Natural Language Processing (GeBNLP) Pub Date : 1900-01-01 DOI: 10.18653/v1/2022.gebnlp-1.15

E. Sesari, Max Hort, Federica Sarro

Citations: 3