Victoria Pachón Álvarez, Jacinto Mata Vázquez, José Manuel López Betanzos, José Luis Arjona Fernández
{"title":"I2C at SemEval-2020 Task 12: Simple but Effective Approaches to Offensive Speech Detection in Twitter","authors":"Victoria Pachón Álvarez, Jacinto Mata Vázquez, José Manuel López Betanzos, José Luis Arjona Fernández","doi":"10.18653/v1/2020.semeval-1.259","DOIUrl":"https://doi.org/10.18653/v1/2020.semeval-1.259","url":null,"abstract":"This paper describes the systems developed by the I2C Group to participate in Subtasks A and B in English, and Subtask A in Turkish and Arabic, in OffensEval (Task 12 of SemEval 2020). In our experiments we compare three architectures we have developed, two based on Transformers and the other based on classical machine learning algorithms. The proposed architectures are described, and the results obtained by our systems are presented.","PeriodicalId":207482,"journal":{"name":"Proceedings of the Fourteenth Workshop on Semantic Evaluation","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1900-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"131045786","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Flor Miriam Plaza del Arco, M. Dolores Molina González, Alfonso Ureña-López, Maite Martin
{"title":"SINAI at SemEval-2020 Task 12: Offensive Language Identification Exploring Transfer Learning Models","authors":"Flor Miriam Plaza del Arco, M. Dolores Molina González, Alfonso Ureña-López, Maite Martin","doi":"10.18653/v1/2020.semeval-1.211","DOIUrl":"https://doi.org/10.18653/v1/2020.semeval-1.211","url":null,"abstract":"This paper describes the participation of the SINAI team in Task 12: OffensEval 2: Multilingual Offensive Language Identification in Social Media. In particular, we participated in Sub-task A in English, which consists of identifying tweets as offensive or not offensive. We preprocess the dataset according to the language characteristics of social media. Then, we select a small subset of the training set provided by the organizers and fine-tune different Transformer-based models in order to test their effectiveness. Our team ranks 20th out of 85 participants in Sub-task A using the XLNet model.","PeriodicalId":207482,"journal":{"name":"Proceedings of the Fourteenth Workshop on Semantic Evaluation","volume":"40 24","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1900-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"131721014","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"SSN_NLP at SemEval-2020 Task 7: Detecting Funniness Level Using Traditional Learning with Sentence Embeddings","authors":"K. S., T. D., Aravindan Chandrabose","doi":"10.18653/v1/2020.semeval-1.109","DOIUrl":"https://doi.org/10.18653/v1/2020.semeval-1.109","url":null,"abstract":"The task of assessing the funniness of edited news headlines deals with estimating the humor in headlines modified with micro-edits. The task has two sub-tasks: the first requires predicting the mean humor score, and the second requires predicting the funnier of two given sentences. We calculated the humor level using microtc and predicted the funnier sentence using microtc, a Universal Sentence Encoder classifier, several other traditional classifiers that use vectors formed with Universal Sentence Encoder embeddings, sentence embeddings, and a majority algorithm applied across these approaches. Among these approaches, microtc with 6 folds and 24 processes and microtc with 3 folds and 36 processes achieve the lowest Root Mean Square Error on the development and test sets, respectively, for subtask 1. For subtask 2, the Universal Sentence Encoder classifier achieves the highest accuracy on the development set, and a Multi-Layer Perceptron applied to Universal Sentence Encoder embedding vectors achieves the highest accuracy on the test set.","PeriodicalId":207482,"journal":{"name":"Proceedings of the Fourteenth Workshop on Semantic Evaluation","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1900-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128527638","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}