Handling Out-of-Vocabulary Words in Lexicons to Polarity Classification
Gabriel Nascimento, Fellipe Duarte, G. Guedes
Proceedings of the 17th Brazilian Symposium on Human Factors in Computing Systems, 2018-10-22
DOI: 10.1145/3274192.3274239 (https://doi.org/10.1145/3274192.3274239)
Citations: 1
Abstract
Emotions play an important role in the area of Human-Computer Interaction (HCI). Sentiment Analysis (SA) aims to detect these emotions in text, and some SA tasks use lexicons to infer the valence polarity of a text. Moreover, attributes extracted from lexicons such as WordNet and LIWC are widely used in SA tasks. However, one of the major challenges in using these lexicons is that many words are missing from their vocabulary. These out-of-vocabulary words may carry valuable information for the SA task and therefore cannot simply be discarded. This paper proposes a new algorithm, named IKLex, that uses word embeddings to infer features for words that are out of the vocabulary of LIWC lexicons. Experiments with IKLex show promising results when state-of-the-art classifiers for the polarity classification task are applied to two datasets in different languages: Brazilian Portuguese and English. The F1 score of the evaluated classifiers improved by at least 1%.
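
The abstract does not describe how IKLex works internally, so the following is only a minimal sketch of the general idea of inferring lexicon features for an out-of-vocabulary word from word embeddings. It swaps in a plain k-nearest-neighbour vote over cosine similarity, which is an assumption and not the authors' algorithm. The function name `infer_categories`, the parameters `k` and `min_votes`, the toy two-dimensional vectors, and the category labels are all hypothetical and chosen for illustration.

```python
import math
from collections import Counter


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def infer_categories(word, embeddings, lexicon, k=3, min_votes=2):
    """Guess lexicon categories for a word missing from the lexicon.

    Assumption: a simple k-nearest-neighbour vote, not the IKLex algorithm.
    A category is kept if at least `min_votes` of the k nearest in-lexicon
    neighbours carry it.
    """
    if word in lexicon:            # already covered, nothing to infer
        return set(lexicon[word])
    if word not in embeddings:     # no vector, so no basis for inference
        return set()
    target = embeddings[word]
    candidates = [w for w in lexicon if w in embeddings]
    neighbours = sorted(
        candidates,
        key=lambda w: cosine(target, embeddings[w]),
        reverse=True,
    )[:k]
    votes = Counter(cat for w in neighbours for cat in lexicon[w])
    return {cat for cat, n in votes.items() if n >= min_votes}


# Toy data (hypothetical): 2-d vectors and illustrative category labels.
embeddings = {
    "good": [0.9, 0.1],
    "great": [0.8, 0.2],
    "happy": [0.85, 0.15],
    "bad": [0.1, 0.9],
    "awful": [0.2, 0.8],
    "awesome": [0.88, 0.12],   # has a vector but is missing from the lexicon
}
lexicon = {
    "good": {"posemo"},
    "great": {"posemo"},
    "happy": {"posemo", "affect"},
    "bad": {"negemo"},
    "awful": {"negemo"},
}

print(infer_categories("awesome", embeddings, lexicon))
```

In a sketch like this, the inferred categories would then be passed to the polarity classifier as if the word had been in the lexicon all along. The vote threshold trades coverage against noise: a lower `min_votes` labels more words but propagates more spurious categories.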