Sentiment Analysis on COVID Tweets: An Experimental Analysis on the Impact of Count Vectorizer and TF-IDF on Sentiment Predictions using Deep Learning Models

2021 International Conference on Digital Futures and Transformative Technologies (ICoDT2). Pub Date: 2021-05-20. DOI: 10.1109/ICoDT252288.2021.9441508

G. Raza, Zainab Saeed Butt, Seemab Latif, Abdul Wahid

Citations: 17

Abstract

With the growing popularity and heavy use of social media, COVID-19 has been a dominant topic of discussion since 2019 and has become a source of stress, anxiety, and depression for people around the world. In this article, we train several deep neural network classifiers on COVID tweet data and aim to improve accuracy using two popular word embedding techniques: Count Vectorizer and Term Frequency-Inverse Document Frequency (TF-IDF). Finally, we compare accuracies and observe that TF-IDF outperforms Count Vectorizer when datasets are very large. In our case, i.e., for COVID-19 tweets, the two vectorizers perform approximately the same, except on the Single Layer Perceptron, where Count Vectorizer yields 10% higher accuracy.
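To make the difference between the two representations concrete, the following is a minimal pure-Python sketch and not the authors' pipeline. The toy tweets, the whitespace tokenizer, and the helper names `count_vectors` and `tfidf_vectors` are all assumptions introduced for illustration. The IDF weight uses the smoothed form ln((1 + n) / (1 + df)) + 1, one common variant, and no length normalization is applied.

```python
import math
from collections import Counter

# Hypothetical toy tweets; these are NOT from the paper's dataset.
tweets = [
    "covid cases rising again",
    "vaccine news gives hope",
    "covid vaccine rollout gives hope",
]


def tokenize(text):
    """Lowercase and split on whitespace (a deliberately simple assumption)."""
    return text.lower().split()


# Shared vocabulary: every distinct token, in sorted order.
vocab = sorted({tok for t in tweets for tok in tokenize(t)})


def count_vectors(docs):
    """Count representation: raw term frequency per vocabulary entry."""
    vectors = []
    for doc in docs:
        counts = Counter(tokenize(doc))
        vectors.append([counts[term] for term in vocab])
    return vectors


def tfidf_vectors(docs):
    """TF-IDF representation: term frequency scaled by a smoothed IDF weight."""
    counts = count_vectors(docs)
    n = len(docs)
    # Document frequency: number of documents containing each term.
    df = [sum(1 for row in counts if row[j] > 0) for j in range(len(vocab))]
    # Smoothed IDF; terms that appear in fewer documents get larger weights.
    idf = [math.log((1 + n) / (1 + d)) + 1 for d in df]
    return [[c * w for c, w in zip(row, idf)] for row in counts]


counts = count_vectors(tweets)
tfidf = tfidf_vectors(tweets)

for term, c, w in zip(vocab, counts[0], tfidf[0]):
    print(f"{term:8s} count={c} tfidf={w:.3f}")
```

In the first tweet, "covid" and "again" both have a count of 1, but "again" receives the larger TF-IDF weight because it occurs in only one tweet while "covid" occurs in two. Either set of vectors could then be fed to a classifier such as the perceptron models compared in the paper.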

