Social Media Mining with Dynamic Clustering: A Case Study by COVID-19 Tweets

Hidetoshi Ito, B. Chakraborty
Venue: 2020 11th International Conference on Awareness Science and Technology (iCAST)
Publication date: 2020-12-07
DOI: 10.1109/iCAST51195.2020.9319496 (https://doi.org/10.1109/iCAST51195.2020.9319496)
Citations: 6

Abstract

Social networking services (SNS) are now used extensively, driven by the spread of the Internet and of cheaper, more compact, easy-to-use computing devices. Posting short texts, especially on Twitter, is popular among people of all ages around the world, and the enormous volume of text generated contains many kinds of information, rumors, and expressions of sentiment. The topics discussed in social media data change over time and sometimes fade out completely. Such time-varying topics may carry useful information that could support decision making by the general public as well as by governmental organizations. For the recent COVID-19 pandemic in particular, extracting and visualizing people's changing needs might help in devising better countermeasures. In this study, COVID-19-related tweets were collected and analyzed in units of time (hour, day, and month) using several clustering models to visualize how topics change over time. Sentence-BERT was found to be the most effective of the techniques tried, although it is not yet sufficient for a clear semantic understanding of the topics.
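
The abstract does not say which clustering models or parameters were used, so the following is only a minimal sketch of the general idea: group tweets by unit of time, then cluster the texts inside each group. It rests on several assumptions. The sample tweets are hypothetical, a plain word-count vector stands in for Sentence-BERT embeddings, and a greedy similarity-threshold pass stands in for the paper's clustering models. All function names and the `threshold` value are illustrative.

```python
from collections import Counter, defaultdict
from datetime import datetime
import math

# Hypothetical sample data (ISO timestamp, tweet text); not taken from the paper.
TWEETS = [
    ("2020-04-01T09:15:00", "masks sold out everywhere"),
    ("2020-04-01T09:40:00", "masks sold out again today"),
    ("2020-04-01T10:05:00", "schools closed this week"),
    ("2020-04-02T08:30:00", "vaccine trial results look promising"),
    ("2020-04-02T21:10:00", "new vaccine trial starts soon"),
]

# strftime patterns that truncate a timestamp to the chosen unit of time.
UNIT_FORMATS = {"hour": "%Y-%m-%d %H", "day": "%Y-%m-%d", "month": "%Y-%m"}


def bucket_by_unit(tweets, unit):
    """Group tweet texts by hour, day, or month."""
    buckets = defaultdict(list)
    for stamp, text in tweets:
        key = datetime.fromisoformat(stamp).strftime(UNIT_FORMATS[unit])
        buckets[key].append(text)
    return dict(buckets)


def embed(text):
    """Stand-in embedding: a word-count vector (an assumption, not Sentence-BERT)."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(count * b[word] for word, count in a.items() if word in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def cluster(texts, threshold=0.3):
    """Greedy one-pass clustering: a text joins the first cluster whose
    seed vector is at least `threshold` similar, else it starts a new one."""
    clusters = []  # list of (seed_vector, member_texts)
    for text in texts:
        vec = embed(text)
        for seed, members in clusters:
            if cosine(seed, vec) >= threshold:
                members.append(text)
                break
        else:
            clusters.append((vec, [text]))
    return [members for _, members in clusters]


def topics_over_time(tweets, unit):
    """Cluster the tweets of each time bucket, in chronological order."""
    buckets = bucket_by_unit(tweets, unit)
    return {key: cluster(texts) for key, texts in sorted(buckets.items())}


if __name__ == "__main__":
    for unit in ("hour", "day", "month"):
        print(unit, topics_over_time(TWEETS, unit))
```

Changing `unit` changes the granularity at which topics appear and disappear, which is the kind of time-dependent view the study aims to visualize. In practice, `embed` would be replaced by a sentence-embedding model and `cluster` by a standard clustering algorithm.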