M. Karrabi, Leila Oskooie, M. Bakhtiar, Mohammad Farahani, R. Monsefi
2020 8th Iranian Joint Congress on Fuzzy and intelligent Systems (CFIS), published 2020-09-01. DOI: 10.1109/CFIS49607.2020.9238699
Sentiment Analysis of Informal Persian Texts Using Embedding Informal words and Attention-Based LSTM Network
The massive volume of comments on websites and social networks makes it possible to learn, on a large scale, what people believe and prefer about products and services. Sentiment analysis, the task of determining the sentiment expressed in a text, has been proposed as an intelligent solution for this purpose. Methodologically, the combination of word embeddings and deep neural networks (DNNs) has recently become an effective approach to sentiment analysis. In Persian studies, word embeddings have been trained on formal corpora such as Wikipedia dumps. Because formal and informal texts differ fundamentally, vectors derived from formal texts do not yield desirable accuracy in informal contexts such as social networks. To overcome this drawback, in this paper we provide a large text corpus that integrates several different sources of informal comments, and we use fastText as the word embedding algorithm. We use an attention-based LSTM, which has been shown to perform more effectively than similar methods in sentiment analysis for the English language. The proposed method is evaluated on two Persian datasets, “Taaghche” and “Filimo”, collected in this paper. The experiments on both datasets show that using informal vectors and applying the attention model improve the prediction accuracy of the DNN in the sentiment analysis of Persian texts.
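The abstract does not spell out which attention formulation is used, so the following is only a minimal sketch of the general idea behind attention over LSTM outputs: each token's hidden state is scored against a context vector, the scores are normalized with a softmax, and the hidden states are averaged with those weights to form one vector for the comment. The dot-product scoring, the names `attention_pool` and `context`, and the toy numbers are all assumptions made for illustration; in a trained model the hidden states would come from the LSTM and the context vector would be learned.

```python
import math
from typing import List, Tuple

Vector = List[float]


def softmax(scores: List[float]) -> List[float]:
    """Numerically stable softmax over a list of scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_pool(hidden_states: List[Vector], context: Vector) -> Tuple[List[float], Vector]:
    """Dot-product attention pooling (an assumed, generic formulation).

    hidden_states: one vector per token, standing in for LSTM outputs.
    context: query vector; hypothetical here, learned in a real model.
    Returns the attention weights and the weighted sum of hidden states.
    """
    # Score each token by its dot product with the context vector.
    scores = [sum(h_d * c_d for h_d, c_d in zip(h, context)) for h in hidden_states]
    weights = softmax(scores)
    dim = len(hidden_states[0])
    # Weighted average of the hidden states, one dimension at a time.
    pooled = [sum(w * h[d] for w, h in zip(weights, hidden_states)) for d in range(dim)]
    return weights, pooled


# Toy placeholders for three token states; not taken from the paper.
hidden_states = [[0.1, 0.0], [2.0, 1.0], [0.0, 0.2]]
context = [1.0, 1.0]
weights, pooled = attention_pool(hidden_states, context)
print(weights, pooled)
```

In this toy input the second token has the largest dot product with the context vector, so it receives most of the weight. The pooled vector would then be passed to a classifier layer to predict the sentiment label.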