Anirban Mukherjee, S. Mukhopadhyay, P. Panigrahi, Saptarsi Goswami
Published in: 2019 IEEE 10th International Conference on Awareness Science and Technology (iCAST), 2019-10-01. DOI: 10.1109/ICAwST.2019.8923260 (https://doi.org/10.1109/ICAwST.2019.8923260)
Utilization of Oversampling for multiclass sentiment analysis on Amazon Review Dataset
Sentiment analysis is a major area of artificial intelligence, with applications in machine translation, text analysis, and computational linguistics. Most work classifies sentiment into two or three classes, but some settings, such as product ratings on Amazon, involve more. A major challenge in such multi-class tasks is class imbalance, which biases the model toward the majority classes and reduces accuracy. To address this, we oversample the dataset to reduce class imbalance before training. In this work, we first compare recurrent neural network variants (simple RNN, GRU, LSTM, and bidirectional LSTM) to determine which performs best at multi-class sentiment classification. We then use the best-performing model to study the effect of oversampling the dataset before training.
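
The oversampling step described above can be illustrated with a minimal sketch. The abstract does not say which oversampling method the authors use, so random oversampling with replacement is an assumption here, and the function name `random_oversample` and the toy reviews are hypothetical. The sketch duplicates minority-class examples until every class is as large as the largest one.

```python
import random
from collections import Counter


def random_oversample(texts, labels, seed=0):
    """Randomly duplicate minority-class examples until every class
    has as many examples as the largest class.

    Assumption: plain random oversampling with replacement; the source
    abstract does not name the method actually used.
    """
    rng = random.Random(seed)

    # Group the examples by class label.
    by_class = {}
    for text, label in zip(texts, labels):
        by_class.setdefault(label, []).append(text)

    # Every class is grown to the size of the largest class.
    target = max(len(items) for items in by_class.values())

    out_texts, out_labels = [], []
    for label, items in by_class.items():
        # Draw the missing examples at random, with replacement.
        extra = [rng.choice(items) for _ in range(target - len(items))]
        for text in items + extra:
            out_texts.append(text)
            out_labels.append(label)

    # Shuffle so that examples of one class are not contiguous.
    order = list(range(len(out_texts)))
    rng.shuffle(order)
    return [out_texts[i] for i in order], [out_labels[i] for i in order]


# Hypothetical toy data: star ratings with a heavy 5-star majority.
texts = ["great", "love it", "perfect", "excellent", "okay", "bad"]
labels = [5, 5, 5, 5, 3, 1]

bal_texts, bal_labels = random_oversample(texts, labels)
counts = Counter(bal_labels)
print(counts)
```

In practice, oversampling of this kind is normally applied to the training split only, so that duplicated examples do not leak into the validation or test data.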