Overcoming Transformer Fine-Tuning process to improve Twitter Sentiment Analysis for Spanish Dialects
Daniel Palomino
LatinX in AI at Neural Information Processing Systems Conference 2020
Published: 2020-12-12 · DOI: 10.52591/lxai202012124
Citations: 0
Abstract
Is there an effective sentiment analysis algorithm for Spanish? This paper aims to answer that question. The task is challenging because the Spanish language has several dialects, so identically written words can carry different meanings and polarities across Spanish-speaking countries. To tackle this multidialect issue we rely on a transfer learning approach: we train a BERT language model to “transfer” general features of the Spanish language, and then fine-tune the language model to specific dialects. BERT is also used to generate contextual data augmentation aimed at preventing overfitting. Finally, we build the polarity classifier and propose a fine-tuning step that operates on groups of layers. Our design choices allow us to achieve state-of-the-art results on multidialect benchmark datasets.
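The abstract's fine-tuning step "using groups of layers" suggests a gradual, group-wise unfreezing schedule. The paper does not give implementation details, so the following is only a minimal sketch of that idea, assuming a BERT-like stack (embeddings, 12 encoder layers, classifier head); the layer names, group size, and top-down ordering are all illustrative assumptions, not the authors' actual configuration.

```python
def unfreeze_schedule(layer_names, group_size):
    """Yield the cumulative list of trainable layers at each fine-tuning
    stage, unfreezing groups of `group_size` layers from the top
    (classifier head) down toward the embeddings."""
    top_down = list(reversed(layer_names))  # train highest layers first
    trainable = []
    for i in range(0, len(top_down), group_size):
        trainable.extend(top_down[i:i + group_size])
        yield list(trainable)

# Hypothetical BERT-like stack: embeddings, 12 encoder layers, classifier.
layers = ["embeddings"] + [f"encoder.{i}" for i in range(12)] + ["classifier"]
stages = list(unfreeze_schedule(layers, group_size=4))
# Stage 1 trains only the top group; the final stage trains every layer.
```

In an actual training loop, each stage would set `requires_grad` only for the parameters whose layer names appear in the current group before running a few epochs, so lower (more general) layers are perturbed last and least.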