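Multilingual Detection of Hate Speech Against Immigrants and Women in Twitter at SemEval-2019 Task 5: Frequency Analysis Interpolation for Hate in Speech Detection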

Proceedings of the 13th International Workshop on Semantic Evaluation. Pub Date: 1900-01-01. DOI: 10.18653/v1/S19-2081

Óscar Garibo i Orts


Citations: 3

Abstract

This document describes a change-of-representation approach to the task of Multilingual Detection of Hate Speech Against Immigrants and Women in Twitter, as part of SemEval-2019. The task is divided into two sub-tasks. Sub-task A consists of classifying tweets as hateful or not hateful. Sub-task B refines that classification by labeling each hateful tweet as directed at a single individual or at a generic group, and as aggressive or not aggressive. Our approach changes the representation space of the text into statistical descriptors that characterize it. In addition, dimensionality is reduced to 6 features per class in order to make the method suitable for a Big Data environment. With Frequency Analysis Interpolation (FAI), we ranked 5th in Spanish and 9th in English, in both cases in sub-task B.
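The abstract does not spell out how FAI is computed, so the following is only a minimal sketch of the general idea of replacing text with a few statistical descriptors per class. Everything in it is an assumption made for illustration: the function names `class_term_weights` and `describe`, the weighting scheme (the share of a term's occurrences that fall in each class), the choice of six descriptors, and the toy tweets and labels. It is not the author's implementation.

```python
from collections import Counter, defaultdict
from statistics import mean, median, pstdev


def class_term_weights(docs, labels):
    """For each class, weight a term by the share of its occurrences in that class."""
    per_class = defaultdict(Counter)
    total = Counter()
    for text, label in zip(docs, labels):
        tokens = text.lower().split()
        per_class[label].update(tokens)
        total.update(tokens)
    return {
        label: {term: count / total[term] for term, count in counts.items()}
        for label, counts in per_class.items()
    }


def describe(text, weights, classes):
    """Map a text to 6 descriptors per class (hypothetical choice of descriptors)."""
    tokens = text.lower().split()
    features = []
    for label in classes:
        # Unseen terms get weight 0.0; an empty text is treated as a single 0.0.
        w = [weights[label].get(tok, 0.0) for tok in tokens] or [0.0]
        features += [
            mean(w),
            pstdev(w),
            min(w),
            max(w),
            median(w),
            sum(1 for x in w if x > 0) / len(w),  # share of tokens seen in this class
        ]
    return features


# Toy data, invented for this example only.
docs = ["go back home", "welcome home friend", "go away now", "nice day friend"]
labels = ["hateful", "not_hateful", "hateful", "not_hateful"]

weights = class_term_weights(docs, labels)
classes = sorted(weights)
vec = describe("go home friend", weights, classes)
print(len(vec))  # 12: six descriptors for each of the two classes
```

The length of the resulting vector depends only on the number of classes, not on the vocabulary size, which is the property the abstract points to when it mentions suitability for a Big Data environment. Any standard classifier could then be trained on these vectors.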

