Quantifying the Gap: A Case Study of Wikidata Gender Disparities

C. Zhang, L. Terveen
Published in: Proceedings of the 17th International Symposium on Open Collaboration, 2021-09-15
DOI: 10.1145/3479986.3479992 (https://doi.org/10.1145/3479986.3479992)
Citations: 9

Abstract

Much prior research has found gender bias in peer production systems like Wikipedia and OpenStreetMap. This bias affects both women’s participation in these platforms and content about women on these platforms. We investigated the gender content gap in Wikidata, where fewer than 22% of items that represent people are about women. We asked: what is the source of this bias? Specifically, does it originate from the actions of Wikidata editors or from external factors; that is, does it simply reflect existing real-world gender bias? We conducted a quantitative case study that found: (i) the most popular categories of people included in Wikidata represent male-dominated professions, such as American football; (ii) within a selected set of professions where we could obtain gender distribution data, Wikidata is no more biased than the real world: men and women are included at similar percentages, and the quality of items representing men and women is also similar. We provide possible explanations for our findings and implications for addressing the Wikidata content gap.
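The comparison in finding (ii) can be pictured as a difference between two proportions: the share of women among a profession's Wikidata items and the share of women in that profession according to an external source. The sketch below is only an illustration of that arithmetic, not the authors' method, which the abstract does not specify. The names `ProfessionCounts`, `share_of_women`, and `representation_gap` are hypothetical, and the numbers in the demo are made-up placeholders, not results from the paper.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ProfessionCounts:
    """Counts for one profession (hypothetical structure, not from the paper)."""
    name: str
    wikidata_women: int          # items about women in this profession
    wikidata_total: int          # all items about people in this profession
    baseline_share_women: float  # real-world share of women, from an external source


def share_of_women(women: int, total: int) -> float:
    """Return the fraction of items that are about women."""
    if total <= 0:
        raise ValueError("total must be positive")
    if not 0 <= women <= total:
        raise ValueError("women must be between 0 and total")
    return women / total


def representation_gap(p: ProfessionCounts) -> float:
    """Wikidata share minus real-world share.

    A value near zero means Wikidata roughly mirrors the external baseline;
    a negative value means women are less represented in Wikidata than in
    the baseline.
    """
    return share_of_women(p.wikidata_women, p.wikidata_total) - p.baseline_share_women


if __name__ == "__main__":
    # Placeholder numbers for illustration only; NOT data from the paper.
    example = ProfessionCounts(
        name="example_profession",
        wikidata_women=20,
        wikidata_total=100,
        baseline_share_women=0.25,
    )
    print(f"{example.name}: gap = {representation_gap(example):+.3f}")
```

A raw difference like this ignores sample size; a real analysis would likely pair it with a statistical test or confidence interval before calling two shares "similar".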