Predicting the Political Polarity of Tweets Using Supervised Machine Learning

Michelle Voong, Keerthana Gunda, S. Gokhale
Published in: 2020 IEEE 44th Annual Computers, Software, and Applications Conference (COMPSAC)
Publication date: 2020-07-01
DOI: 10.1109/COMPSAC48688.2020.000-9
URL: https://doi.org/10.1109/COMPSAC48688.2020.000-9
Citations: 1

Abstract

With the advent of social media, politicians, media outlets, and ordinary citizens alike routinely turn to Twitter to share their thoughts and feelings. Distinguishing politically biased tweets from neutral ones can help determine how prone an elected official or a media outlet is to engaging in political rhetoric. This paper presents a supervised machine learning approach to predict whether a tweet is politically biased or neutral. The approach uses a labeled data set available from Crowdflower, in which each tweet is tagged with a partisan/neutral label plus its message type and audience. It considers a combination of linguistic features, including Term Frequency-Inverse Document Frequency (TF-IDF), bigrams, and trigrams, along with metadata features, including mentions, retweets, and URLs, as well as the additional labels of message type and audience. It trains both simple and ensemble classifiers and assesses their performance using precision, recall, and F1-score. The results demonstrate that the classifiers can predict the polarity of a tweet accurately when trained on a combination of TF-IDF and metadata features that can be extracted automatically from the tweets, eliminating the need for additional tagging, which is manual, cumbersome, and error-prone.
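The feature and evaluation steps named in the abstract can be illustrated with a minimal, standard-library sketch. It is not the authors' implementation: the regular expressions, the "RT " prefix test for retweets, the particular TF-IDF variant (term frequency times ln(N/df)), and all function names are assumptions made for illustration. Classifier training is omitted; the sketch covers only automatic extraction of metadata features, n-gram TF-IDF weights, and the precision, recall, and F1 computation for the partisan class.

```python
import math
import re
from collections import Counter

# Assumed patterns for illustration; real tweets may need more careful handling.
MENTION_RE = re.compile(r"@\w+")
URL_RE = re.compile(r"https?://\S+")


def metadata_features(tweet):
    """Count mentions and URLs, and flag retweets (assumed 'RT ' prefix convention)."""
    return {
        "mentions": len(MENTION_RE.findall(tweet)),
        "urls": len(URL_RE.findall(tweet)),
        "is_retweet": int(tweet.startswith("RT ")),
    }


def ngrams(tweet, max_n=3):
    """Return lowercased unigrams, bigrams, and trigrams with URLs and mentions removed."""
    text = MENTION_RE.sub(" ", URL_RE.sub(" ", tweet))
    tokens = re.findall(r"[a-z0-9']+", text.lower())
    grams = []
    for n in range(1, max_n + 1):
        grams += [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]
    return grams


def tfidf(docs):
    """One common TF-IDF variant: (count / doc length) * ln(N / document frequency)."""
    tokenized = [ngrams(d) for d in docs]
    df = Counter(term for grams in tokenized for term in set(grams))
    n_docs = len(docs)
    vectors = []
    for grams in tokenized:
        counts = Counter(grams)
        total = len(grams) or 1  # avoid division by zero for empty tweets
        vectors.append(
            {t: (c / total) * math.log(n_docs / df[t]) for t, c in counts.items()}
        )
    return vectors


def precision_recall_f1(y_true, y_pred, positive="partisan"):
    """Precision, recall, and F1 for the chosen positive label."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1
```

In a full pipeline, the dictionaries returned by tfidf and metadata_features would be merged into one feature vector per tweet before being passed to a classifier; a term that occurs in every tweet receives a weight of zero under this variant.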