Sentiment Analysis on COVID Tweets: An Experimental Analysis on the Impact of Count Vectorizer and TF-IDF on Sentiment Predictions using Deep Learning Models

G. Raza, Zainab Saeed Butt, Seemab Latif, Abdul Wahid
Published in: 2021 International Conference on Digital Futures and Transformative Technologies (ICoDT2), 2021-05-20
DOI: https://doi.org/10.1109/ICoDT252288.2021.9441508
Citations: 17

Abstract

Driven by the popularity and heavy use of social media, COVID-19 has been a dominant topic of discussion since 2019 and has become a source of stress, anxiety, and depression for people around the world. In this article, we train several deep neural network classifiers on COVID tweet data and compare the accuracy obtained with two popular text vectorization techniques: Count Vectorizer and Term Frequency-Inverse Document Frequency (TF-IDF). We observe that TF-IDF is more effective than Count Vectorizer when datasets are very large. In our case, i.e., for COVID-19 tweets, the two vectorizers perform approximately the same, except on the Single Layer Perceptron, where Count Vectorizer yields 10% higher accuracy.
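
To make the comparison concrete, the following is a minimal sketch of the two vectorizers feeding a single-layer perceptron. It is not the authors' pipeline: the toy corpus, the labels, the whitespace tokenizer, and all function names are hypothetical, and the IDF formula is assumed to be the smoothed variant ln((1 + n) / (1 + df)) + 1 without any row normalization. It uses only the Python standard library and reports accuracy on the training examples themselves.

```python
import math
from collections import Counter

# Hypothetical toy corpus; the paper's tweet dataset is not reproduced here.
DOCS = [
    "lockdown makes me anxious and sad",
    "vaccine news makes me hopeful",
    "sad and anxious about covid",
    "hopeful and happy about vaccine",
]
LABELS = [0, 1, 0, 1]  # 0 = negative, 1 = positive (illustrative labels)


def build_vocab(docs):
    """Map each distinct whitespace token to a column index (sorted order)."""
    tokens = sorted({tok for doc in docs for tok in doc.split()})
    return {tok: i for i, tok in enumerate(tokens)}


def count_vectors(docs, vocab):
    """Count vectorization: each entry is the raw count of a term in a document."""
    vectors = []
    for doc in docs:
        counts = Counter(doc.split())
        vectors.append([float(counts.get(tok, 0)) for tok in vocab])
    return vectors


def tfidf_vectors(docs, vocab):
    """TF-IDF: raw counts scaled by a smoothed inverse document frequency."""
    counts = count_vectors(docs, vocab)
    n = len(docs)
    df = [sum(1 for row in counts if row[j] > 0) for j in range(len(vocab))]
    idf = [math.log((1 + n) / (1 + d)) + 1.0 for d in df]  # assumed IDF variant
    return [[tf * w for tf, w in zip(row, idf)] for row in counts]


def predict(x, w, b):
    """Threshold the weighted sum: 1 if positive, else 0."""
    return 1 if sum(wi * xi for wi, xi in zip(w, x)) + b > 0 else 0


def train_perceptron(X, y, epochs=20, lr=1.0):
    """Classic perceptron rule: update weights only on misclassified examples."""
    w, b = [0.0] * len(X[0]), 0.0
    for _ in range(epochs):
        for x, target in zip(X, y):
            err = target - predict(x, w, b)
            if err:
                w = [wi + lr * err * xi for wi, xi in zip(w, x)]
                b += lr * err
    return w, b


def accuracy(X, y, w, b):
    """Fraction of examples whose predicted label matches the target."""
    return sum(predict(x, w, b) == t for x, t in zip(X, y)) / len(y)


vocab = build_vocab(DOCS)
features = {
    "count": count_vectors(DOCS, vocab),
    "tf-idf": tfidf_vectors(DOCS, vocab),
}
for name, X in features.items():
    w, b = train_perceptron(X, LABELS)
    print(name, "training accuracy:", accuracy(X, LABELS, w, b))
```

A four-document corpus is far too small to say anything about which representation is better; the sketch only shows where the two feature matrices differ (TF-IDF down-weights terms that appear in many documents) while the classifier stays the same. A faithful reproduction would need the tweet dataset, a held-out test split, and the network architectures used in the paper.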