H. Kumar, B. Harish, S. V. A. Kumar, Manjunath Aradhya
International Conference on Machine Learning and Soft Computing, published 2018-02-02. DOI: 10.1145/3184066.3184074 (https://doi.org/10.1145/3184066.3184074)
Classification of sentiments in short-text: an approach using mSMTP measure
Sentiment analysis, or opinion mining, is an automated process for recognizing the opinions, moods, emotions, and attitudes of individuals or communities through natural language processing, text analysis, and computational linguistics. In recent years, many studies have concentrated on blogs, tweets, forums, and consumer review websites to identify the sentiment of communities. The information retrieved from social networking sites is typically short, informal text because of the character limits on blogging sites and consumer review websites. Sentiment analysis of short text is a challenging task: because of the character limit, users tend to shorten their messages, which leads to misspellings, slang terms, and shortened forms of words. Moreover, compared with regular text, short texts are characterized more strongly by the presence and absence of terms (features). In this work, our major goal is to classify sentiments into positive, negative, or neutral polarity using a new similarity measure. The proposed method embeds a modified Similarity Measure for Text Processing (mSMTP) in a K-Nearest Neighbor (KNN) classifier. The effectiveness of the proposed method is evaluated by comparing it with Euclidean distance, cosine similarity, the Jaccard coefficient, and the correlation coefficient. The proposed method is also compared with other classifiers, such as Support Vector Machine and Random Forest, on a benchmark dataset. The classification results are evaluated in terms of accuracy, precision, recall, and F-measure.
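To make the setup described above concrete, the following is a minimal sketch of a KNN classifier whose similarity measure is a pluggable function, together with per-class precision, recall, and F-measure. The abstract does not give the mSMTP formula, so it is not implemented here; the sketch uses only three of the baseline measures (cosine, Jaccard, and a similarity derived from Euclidean distance), and an mSMTP function could be passed through the same `similarity` parameter. All function names, the whitespace tokenizer, the `1 / (1 + distance)` conversion, and the toy training texts are assumptions made for illustration, not the authors' implementation or data.

```python
import math
from collections import Counter


def vectorize(text):
    """Bag-of-words term counts for a lowercased, whitespace-tokenized text."""
    return Counter(text.lower().split())


def cosine_similarity(a, b):
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def jaccard_similarity(a, b):
    # Uses term presence only; counts are ignored.
    terms_a, terms_b = set(a), set(b)
    union = terms_a | terms_b
    return len(terms_a & terms_b) / len(union) if union else 0.0


def euclidean_similarity(a, b):
    # Assumption: distance is mapped to a similarity via 1 / (1 + d).
    dist = math.sqrt(sum((a[t] - b[t]) ** 2 for t in set(a) | set(b)))
    return 1.0 / (1.0 + dist)


def knn_predict(train, text, similarity, k=3):
    """Majority label among the k training texts most similar to `text`."""
    query = vectorize(text)
    ranked = sorted(
        train,
        key=lambda item: similarity(query, vectorize(item[0])),
        reverse=True,
    )
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


def precision_recall_f1(gold, predicted, label):
    """Precision, recall, and F-measure for one class label."""
    pairs = list(zip(gold, predicted))
    tp = sum(g == label and p == label for g, p in pairs)
    fp = sum(g != label and p == label for g, p in pairs)
    fn = sum(g == label and p != label for g, p in pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


# Hypothetical toy data, not the benchmark dataset used in the paper.
TRAIN = [
    ("love this phone great battery", "positive"),
    ("great service love it", "positive"),
    ("really great love the screen", "positive"),
    ("hate this phone terrible battery", "negative"),
    ("terrible service hate it", "negative"),
    ("awful screen hate the battery", "negative"),
]

MEASURES = {
    "cosine": cosine_similarity,
    "jaccard": jaccard_similarity,
    "euclidean": euclidean_similarity,
}

if __name__ == "__main__":
    for name, measure in MEASURES.items():
        label = knn_predict(TRAIN, "love the great battery", measure)
        print(name, label)
```

Keeping the measure as a function argument means the classifier itself does not change when the measure is swapped, which mirrors the comparison of one KNN classifier under several measures that the abstract describes.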