BelElect: A New Dataset for Bias Research from a "Dark" Platform

Sviatlana Höhn, S. Mauw, Nicholas M. Asher
International Conference on Web and Social Media, published 2022-05-31. DOI: 10.1609/icwsm.v16i1.19378 (https://doi.org/10.1609/icwsm.v16i1.19378)
Citation count: 1

Abstract

New social networks and platforms such as Telegram, Gab, and Parler offer a stage for extremist, racist, and aggressive content, but they also provide a safe space for freedom fighters under authoritarian regimes. Data from such platforms offer excellent opportunities for research on issues such as linguistic bias and toxic language detection. However, only a few corpora from such platforms exist; they are mostly unannotated and English-only. This article presents a new Telegram corpus in Russian and Belarusian, tailored for research on linguistic bias in political news. In addition, we created a repository that makes all currently available corpora from so-called "dark" platforms accessible in one place.