Mining Twitter in the Cloud: A Case Study

P. Noordhuis, M. Heijkoop, A. Lazovik
Published in: 2010 IEEE 3rd International Conference on Cloud Computing, 2010-07-05
DOI: 10.1109/CLOUD.2010.59 (https://doi.org/10.1109/CLOUD.2010.59)
Citations: 56

Abstract

Mining and analyzing data from social networks can be difficult because of the large amounts of data involved. Such activities are usually very expensive, as they require a lot of computational resources. With the recent success of cloud computing, data analysis is going to be more accessible due to easier access to less expensive computational resources. In this work we propose to use cloud computing services as a possible solution for analysis of large amounts of data. As a source for a large data set, we propose to use Twitter, yielding a graph with 50 million nodes and 1.8 billion edges. In this paper, we use computation of PageRank on Twitter’s social graph to investigate whether or not cloud computing, and Amazon cloud services in particular, can make these tasks more feasible and, as a side effect, whether or not PageRank provides a good ranking of Twitter users.
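
The abstract names PageRank on a follower graph as the workload but does not show the computation. As a rough illustration only, the following is a minimal single-machine sketch of power-iteration PageRank in Python. It is not the authors' cloud implementation: the function name `pagerank`, the damping factor of 0.85 (a conventional default), the iteration count, and the four-user toy graph are all assumptions made for this example.

```python
def pagerank(edges, damping=0.85, iterations=50):
    """Power-iteration PageRank over (follower, followee) edges.

    Hypothetical helper for illustration; damping and iterations are
    assumed defaults, not values taken from the paper.
    """
    nodes = sorted({n for edge in edges for n in edge})
    out_links = {n: [] for n in nodes}
    for src, dst in edges:
        out_links[src].append(dst)

    n = len(nodes)
    rank = {node: 1.0 / n for node in nodes}
    for _ in range(iterations):
        # Rank held by nodes with no outgoing edges is spread evenly.
        dangling = sum(rank[v] for v in nodes if not out_links[v])
        base = (1.0 - damping) / n + damping * dangling / n
        new_rank = {node: base for node in nodes}
        for src in nodes:
            if out_links[src]:
                # Each user passes an equal share of rank to everyone they follow.
                share = damping * rank[src] / len(out_links[src])
                for dst in out_links[src]:
                    new_rank[dst] += share
        rank = new_rank
    return rank


# Toy graph (made up): an edge (a, b) means "a follows b".
edges = [
    ("alice", "carol"),
    ("bob", "carol"),
    ("carol", "alice"),
    ("dave", "carol"),
]
ranks = pagerank(edges)
top_user = max(ranks, key=ranks.get)
```

In this toy graph `carol` has the most followers and ends up with the highest rank. A graph with 50 million nodes and 1.8 billion edges would not fit this in-memory dictionary approach comfortably, which is the kind of scale problem that motivates distributing the computation.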