COVID19 Tweeter Dataset Sentiment Analysis

Anubhav Kumar, Kyongsik Yun, Teklay Gebregzabiher, Berihu Yohannes Tesfay, Solomon Gebremeskel Adane
DOI: 10.1109/CCICT53244.2021.00032 (https://doi.org/10.1109/CCICT53244.2021.00032)
Published in: 2021 Fourth International Conference on Computational Intelligence and Communication Technologies (CCICT)
Publication date: 2021-07-01
Citations: 4

Abstract

The WHO has declared COVID-19 ('CO' stands for corona, 'VI' for virus, and 'D' for disease) a global pandemic. At the start of 2020 the outbreak was confined to China, but it has since affected more than 206 countries. More than 3.5 billion people have been infected worldwide, and more than 1 million of them have died from this incurable disease. At the time of writing, the WHO had not approved any vaccine. People around the globe have been affected by COVID-19, and they have shared their views on social media, mainly on Twitter. Over the last nine months, hundreds of billions of texts have been written on Twitter.

Sentiment analysis is a natural language processing (NLP) application that categorizes the sentiment of a text as positive, negative, or neutral. Various machine learning (ML) algorithms are used to extract sentiment from text, but they require the text in a specific format. Producing that format is a major step in the whole sentiment analysis process, because the data available from Twitter is raw and needs extensive preprocessing and cleaning before it can be used.

This article discusses COVID-19-related Twitter data in detail: the different ways to use Twitter data for sentiment analysis, the difficulties involved, the steps in Twitter data preprocessing, and the final, ready-to-use form of the dataset. Python is used as the programming language for sentiment analysis as well as for data cleaning and preprocessing. The Python libraries used for data preprocessing are also discussed.
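The abstract does not list the concrete preprocessing steps, so the following is only a minimal sketch of what cleaning a raw tweet in Python might look like. The function names (clean_tweet, label_sentiment), the regular expressions, and the tiny word lists are illustrative assumptions, not the authors' pipeline. The lexicon-based labeler is a deliberately simple stand-in for the ML algorithms the abstract mentions, included only to show the positive / negative / neutral categories.

```python
import re

# Patterns for common tweet noise (illustrative choices, not from the paper).
URL_RE = re.compile(r"https?://\S+|www\.\S+")
MENTION_RE = re.compile(r"@\w+")
NON_ALNUM_RE = re.compile(r"[^a-z0-9\s]")
WHITESPACE_RE = re.compile(r"\s+")
RETWEET_RE = re.compile(r"^rt\s+")

# Hypothetical toy lexicons; a real study would use a trained model
# or an established sentiment lexicon.
POSITIVE_WORDS = {"safe", "recovered", "hope", "thanks"}
NEGATIVE_WORDS = {"died", "fear", "sick"}


def clean_tweet(text: str) -> str:
    """Lowercase a raw tweet and strip URLs, mentions, '#', and punctuation."""
    text = text.lower()
    text = URL_RE.sub(" ", text)        # drop links
    text = MENTION_RE.sub(" ", text)    # drop @user mentions
    text = text.replace("#", " ")       # keep the hashtag word, drop '#'
    text = NON_ALNUM_RE.sub(" ", text)  # drop punctuation, emoji, symbols
    text = WHITESPACE_RE.sub(" ", text).strip()
    return RETWEET_RE.sub("", text)     # drop a leading retweet marker


def label_sentiment(cleaned: str) -> str:
    """Label cleaned text by counting toy positive and negative words."""
    tokens = cleaned.split()
    score = sum(t in POSITIVE_WORDS for t in tokens) - sum(
        t in NEGATIVE_WORDS for t in tokens
    )
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


if __name__ == "__main__":
    raw = "RT @WHO: Stay home, stay SAFE! #COVID19 https://t.co/abc123"
    cleaned = clean_tweet(raw)
    print(cleaned, "->", label_sentiment(cleaned))
```

Cleaning runs before labeling so that the classifier sees plain tokens rather than links, handles, and punctuation, which is the ordering the abstract describes.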