{"title":"Reducing sentiment polarity for demographic attributes in word embeddings using adversarial learning","authors":"Chris Sweeney, M. Najafian","doi":"10.1145/3351095.3372837","DOIUrl":null,"url":null,"abstract":"The use of word embedding models in sentiment analysis has gained a lot of traction in the Natural Language Processing (NLP) community. However, many inherently neutral word vectors describing demographic identity have unintended implicit correlations with negative or positive sentiment, resulting in unfair downstream machine learning algorithms. We leverage adversarial learning to decorrelate demographic identity term word vectors with positive or negative sentiment, and re-embed them into the word embeddings. We show that our method effectively minimizes unfair positive/negative sentiment polarity while retaining the semantic accuracy of the word embeddings. Furthermore, we show that our method effectively reduces unfairness in downstream sentiment regression and can be extended to reduce unfairness in toxicity classification tasks.","PeriodicalId":377829,"journal":{"name":"Proceedings of the 2020 Conference on Fairness, Accountability, and Transparency","volume":"54 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2020-01-27","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"33","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Proceedings of the 2020 Conference on Fairness, Accountability, and Transparency","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1145/3351095.3372837","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 33
Abstract
The use of word embedding models in sentiment analysis has gained a lot of traction in the Natural Language Processing (NLP) community. However, many inherently neutral word vectors describing demographic identity have unintended implicit correlations with negative or positive sentiment, resulting in unfair downstream machine learning algorithms. We leverage adversarial learning to decorrelate demographic identity term word vectors with positive or negative sentiment, and re-embed them into the word embeddings. We show that our method effectively minimizes unfair positive/negative sentiment polarity while retaining the semantic accuracy of the word embeddings. Furthermore, we show that our method effectively reduces unfairness in downstream sentiment regression and can be extended to reduce unfairness in toxicity classification tasks.