Detecting gender bias in Arabic text through word embeddings.

IF 2.6 | Zone 3, Multidisciplinary Journals | JCR Q1 MULTIDISCIPLINARY SCIENCES
PLoS ONE Pub Date : 2025-03-31 eCollection Date: 2025-01-01 DOI:10.1371/journal.pone.0319301
Aya Mourad, Fatima K Abu Salem, Shady Elbassuoni
Citations: 0

Abstract

For generations, women have fought to achieve equal rights with men. Many historians and social scientists have examined this uphill path with a focus on women's rights and economic status in the West. Other parts of the world, such as the Middle East, remain understudied, with a noticeable shortage of gender-based statistics in the economic arena. According to the sociocognitive theory of critical discourse analysis, social behaviors and norms are reflected in language discourse. This motivates the present study, in which we examine gender-based biases in various occupations as reflected in various textual corpora. Several works in the literature have shown that word embedding models can learn biases from the textual data they are trained on, which can propagate societal prejudices implicitly embedded in such text. In our study, we adapt the WEAT and Direct Bias quantification tests for Arabic to examine gender bias with respect to a wide set of occupations as reflected in various Arabic text datasets. These datasets include two Lebanese news archives, Arabic Wikipedia, and electronic newspapers in the UAE, Egypt, and Morocco, thus providing different outlooks on female and male engagement in various professions. Our WEAT tests indicate that, across all datasets, words related to careers, science, and intellectual pursuits are linked to men, whereas words related to family and art are associated with women. The Direct Bias analysis shows a consistent female gender bias towards professions such as nurse, house cleaner, maid, secretary, and dancer. In the Moroccan News Articles Dataset (MNAD), females were also associated with additional occupations such as researcher, doctor, and professor. Considering that the Arab world remains short on census data exploring gender-based disparities across various professions, our work provides evidence that such stereotypes persist to this day.
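
The abstract names two tests, WEAT and Direct Bias, without giving their formulas. The sketch below illustrates the commonly cited definitions of both: the WEAT effect size (difference in mean association of two target word sets with two attribute sets, divided by the standard deviation over all target words) and Direct Bias (mean absolute cosine between gender-neutral words and a gender direction, raised to a strictness power `c`). It is not the authors' implementation, and it says nothing about how they adapted the tests to Arabic. The 2-D vectors, the set names (`male`, `female`, `career`, `family`, `occupations`), the use of the sample standard deviation (`ddof=1`), and the gender direction taken as a simple difference of group means are all assumptions made for illustration; real use would load trained embeddings instead.

```python
import numpy as np


def cosine(u, v):
    """Cosine similarity between two 1-D vectors."""
    return float(np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v)))


def association(w, A, B):
    """s(w, A, B): mean cosine of w to attribute set A minus mean cosine to B."""
    to_a = np.mean([cosine(w, a) for a in A])
    to_b = np.mean([cosine(w, b) for b in B])
    return float(to_a - to_b)


def weat_effect_size(X, Y, A, B):
    """WEAT effect size for target sets X, Y and attribute sets A, B.

    Assumption: the denominator is the sample standard deviation (ddof=1)
    of the association scores over all target words in X and Y.
    """
    s_x = [association(x, A, B) for x in X]
    s_y = [association(y, A, B) for y in Y]
    spread = np.std(s_x + s_y, ddof=1)
    return float((np.mean(s_x) - np.mean(s_y)) / spread)


def direct_bias(neutral_words, g, c=1.0):
    """Mean of |cos(w, g)|**c over words that should be gender-neutral."""
    return float(np.mean([abs(cosine(w, g)) ** c for w in neutral_words]))


# Hypothetical toy embeddings (2-D); stand-ins for real word vectors.
male = [np.array([1.0, 0.1]), np.array([1.0, -0.1])]
female = [np.array([-1.0, 0.1]), np.array([-1.0, -0.1])]
career = [np.array([0.9, 0.5]), np.array([0.8, 0.6])]
family = [np.array([-0.9, 0.5]), np.array([-0.8, 0.6])]
occupations = [np.array([0.7, 0.7]), np.array([-0.6, 0.8])]

# Positive effect size: career words sit closer to male terms than family words do.
effect = weat_effect_size(career, family, male, female)

# Simplified gender direction: difference of the two group means (an assumption).
g = np.mean(male, axis=0) - np.mean(female, axis=0)
bias = direct_bias(occupations, g)

print(f"WEAT effect size: {effect:.3f}")
print(f"Direct Bias (c=1): {bias:.3f}")
```

In this formulation a Direct Bias near 0 means the occupation words are roughly orthogonal to the gender direction, while a value near 1 means they lie almost entirely along it. The sign of `cosine(w, g)` for an individual occupation indicates which gender it leans towards, which is the kind of per-occupation reading the abstract reports.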

Source journal: PLoS ONE (Biology)
CiteScore: 6.20
Self-citation rate: 5.40%
Articles published: 14242
Review time: 3.7 months
About the journal: PLOS ONE is an international, peer-reviewed, open-access, online publication. PLOS ONE welcomes reports on primary research from any scientific discipline. It provides:
* Open access: freely accessible online, authors retain copyright
* Fast publication times
* Peer review by expert, practicing researchers
* Post-publication tools to indicate quality and impact
* Community-based dialogue on articles
* Worldwide media coverage