Evaluating Gender Bias Transfer from Film Data
Amanda Bertsch, Ashley Oh, Sanika Natu, Swetha Gangu, A. Black, Emma Strubell
Proceedings of the 4th Workshop on Gender Bias in Natural Language Processing (GeBNLP)
DOI: 10.18653/v1/2022.gebnlp-1.24 (https://doi.org/10.18653/v1/2022.gebnlp-1.24)
Citations: 2
Abstract
Films are a rich source of data for natural language processing. OpenSubtitles (Lison and Tiedemann, 2016) is a popular movie script dataset, used for training models for tasks such as machine translation and dialogue generation. However, movies often contain biases that reflect society at the time, and these biases may be introduced during pre-training and influence downstream models. We perform sentiment analysis on template infilling (Kurita et al., 2019) and the Sentence Embedding Association Test (May et al., 2019) to measure how BERT-based language models change after continued pre-training on OpenSubtitles. We consider gender bias as a primary motivating case for this analysis, while also measuring other social biases such as disability. We show that sentiment analysis on template infilling is not an effective measure of bias due to the rarity of disability and gender identifying tokens in the movie dialogue. We extend our analysis to a longitudinal study of bias in film dialogue over the last 110 years and find that continued pre-training on OpenSubtitles encodes additional bias into BERT. We show that BERT learns associations that reflect the biases and representation of each film era, suggesting that additional care must be taken when using historical data.
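As a rough illustration of the template-infilling probe mentioned in the abstract (Kurita et al., 2019), the sketch below fills gendered templates with a BERT masked language model and scores the completions with an off-the-shelf sentiment classifier. The checkpoints and templates are illustrative assumptions, not the paper's exact setup; in the paper's setting, the infilling model would be compared before and after continued pre-training on OpenSubtitles.

```python
# Minimal sketch of sentiment analysis over template infills; the checkpoints
# and templates below are illustrative assumptions, not the paper's exact setup.
from transformers import pipeline

# Masked LM used for infilling. To mirror the paper's comparison, this would be
# swapped between plain BERT and a BERT continued-pretrained on OpenSubtitles.
fill_mask = pipeline("fill-mask", model="bert-base-uncased")

# Off-the-shelf sentiment classifier for scoring the completed sentences.
sentiment = pipeline(
    "sentiment-analysis",
    model="distilbert-base-uncased-finetuned-sst-2-english",
)

# Hypothetical gendered template pair in the style of Kurita et al. (2019).
templates = [
    "The woman is very [MASK].",
    "The man is very [MASK].",
]

for template in templates:
    # Score the top-k most probable infills for the masked slot.
    for candidate in fill_mask(template, top_k=5):
        sentence = candidate["sequence"]
        label = sentiment(sentence)[0]
        print(f"{sentence!r:45} -> {label['label']} ({label['score']:.2f})")
```

Comparing the sentiment distributions of completions across the gendered template pair, and across the two checkpoints, gives a rough sense of how continued pre-training on film dialogue shifts the model's associations.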