Transformer-based Arabic Offensive Speech Detection

Saja Al-Dabet, A. Elmassry, Ban AlOmar, Abdullah Alshamsi
Published in: 2023 International Conference on Emerging Smart Computing and Informatics (ESCI), 2023-03-01
DOI: 10.1109/ESCI56872.2023.10100134 (https://doi.org/10.1109/ESCI56872.2023.10100134)
Citations: 1

Abstract

The prevalence of social media platforms has prompted efforts to detect language in online posts and comments that is intended to harm or intimidate a person or group. On Twitter, for instance, users are susceptible to cyberbullying and hate speech, which may escalate into physical and psychological violence. This study presents a transformer-based approach to offensive speech detection. The model employs versions of the CAMeLBERT model and is validated on a mixture of four benchmark Arabic Twitter datasets annotated for the hate speech detection task, including the OSACT5 2022 workshop shared task dataset. The presented model recognized Arabic tweets containing offensive speech with 87.15% accuracy and an 83.6% F1 score.
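The two reported metrics can be illustrated with a minimal sketch. The abstract does not say whether the F1 score is macro-averaged or computed on the offensive class only, so the code below assumes macro averaging over both classes. The labels and predictions are made-up toy values, not data from the study, and the function names are hypothetical.

```python
from typing import Sequence


def accuracy(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Fraction of predictions that match the gold labels."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def f1_for_label(y_true: Sequence[int], y_pred: Sequence[int], label: int) -> float:
    """F1 for one class: 2*TP / (2*TP + FP + FN), or 0.0 if undefined."""
    tp = sum(t == label and p == label for t, p in zip(y_true, y_pred))
    fp = sum(t != label and p == label for t, p in zip(y_true, y_pred))
    fn = sum(t == label and p != label for t, p in zip(y_true, y_pred))
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def macro_f1(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Unweighted mean of per-class F1 (assumed averaging scheme)."""
    labels = sorted(set(y_true) | set(y_pred))
    return sum(f1_for_label(y_true, y_pred, lab) for lab in labels) / len(labels)


# Toy example: 1 = offensive, 0 = not offensive (illustrative values only).
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]

acc = accuracy(y_true, y_pred)
f1 = macro_f1(y_true, y_pred)
print(f"accuracy={acc:.4f} macro_f1={f1:.4f}")
```

On imbalanced offensive-speech datasets, accuracy and F1 can diverge noticeably, which is one reason both figures are usually reported together.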