Emotion Classification on Twitter Data Using Word Embedding and Lexicon Based Approach

R. Raj, Prasanjeet Das, P. Sahu
2020 IEEE 9th International Conference on Communication Systems and Network Technologies (CSNT), published 2020-04-01. DOI: 10.1109/CSNT48778.2020.9115750 (https://doi.org/10.1109/CSNT48778.2020.9115750)
Citations: 4

Abstract

Twitter is one of the leading social media platforms, allowing users to post up to 280 characters in a single tweet. Social media encourages users to share data, promote advertisements, and post useful information to their followers. Social media data helps individuals and businesses make decisions based on analysis of that data. Twitter sentiment analysis is important for identifying similar text patterns in a given input text. The analysis further classifies tweets as emotional, positive, or negative. Existing approaches are limited in accuracy; therefore, a word embedding and lexicon-based approach is introduced to increase accuracy. The Twitter data stream is taken as input and preprocessed by removing stop words, hashtags, and URLs. The data is then tokenized; the word embedding method is applied to detect location, and the lexicon-based approach is applied to separate sentiment tweets from emotional tweets. The system has been tested on a live data set as well as an offline data set, and the results are very promising.
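The preprocessing and lexicon steps named in the abstract can be illustrated with a minimal sketch. The stop-word list, the lexicon entries, the function names `preprocess` and `classify`, and the fallback label "unknown" are all assumptions made for illustration; the abstract does not specify the resources or the scoring rule the authors used, and the word embedding step for location detection is not shown here.

```python
import re

# Illustrative stop-word list and lexicon. Both are assumptions for this
# sketch, not the resources used in the paper.
STOP_WORDS = {"a", "an", "the", "is", "are", "to", "of", "and",
              "in", "on", "for", "so", "i", "my", "this"}
LEXICON = {
    "love": "positive", "great": "positive", "good": "positive",
    "hate": "negative", "bad": "negative", "terrible": "negative",
    "happy": "emotional", "sad": "emotional", "angry": "emotional",
}

URL_RE = re.compile(r"https?://\S+|www\.\S+")
HASHTAG_RE = re.compile(r"#\w+")
TOKEN_RE = re.compile(r"[a-z']+")


def preprocess(tweet):
    """Lowercase, strip URLs and hashtags, tokenize, and drop stop words."""
    text = URL_RE.sub(" ", tweet.lower())
    text = HASHTAG_RE.sub(" ", text)
    return [tok for tok in TOKEN_RE.findall(text) if tok not in STOP_WORDS]


def classify(tokens):
    """Return the label with the most lexicon hits, or 'unknown' if none.

    Majority voting is an assumed scoring rule chosen for simplicity.
    """
    counts = {}
    for tok in tokens:
        label = LEXICON.get(tok)
        if label is not None:
            counts[label] = counts.get(label, 0) + 1
    if not counts:
        return "unknown"
    return max(counts, key=counts.get)


if __name__ == "__main__":
    tokens = preprocess("I love this phone! #awesome https://t.co/xyz")
    print(tokens, classify(tokens))
```

A real system would likely replace the toy lexicon with a published sentiment or emotion lexicon and handle negation, which this sketch ignores.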