Experiments with computing similarity coefficient over big data

M. Cosulschi, M. Gabroveanu, Florin Slabu, Adriana Sbircea
Published in: IISA 2014, The 5th International Conference on Information, Intelligence, Systems and Applications
Publication date: 2014-07-07
DOI: 10.1109/IISA.2014.6878734 (https://doi.org/10.1109/IISA.2014.6878734)
Citations: 1

Abstract

Big data is a hot topic today because of the huge volume of data produced by commercial processes and the data handled every day by social networks. The MapReduce programming model targets the processing and generation of large data sets. Using the Jaccard similarity coefficients computed for two very large graphs, we analysed the connections and influence that some nodes have over other nodes. Furthermore, we show how the Apache Hadoop framework and the MapReduce programming model can be used for high-volume computations. All tests were performed on a distributed cluster to obtain the results described in the paper.
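
The abstract does not say how the Jaccard coefficient is defined on the graphs. A minimal sketch, assuming the common definition over neighbour sets, J(u, v) = |N(u) ∩ N(v)| / |N(u) ∪ N(v)|, and imitating the map and reduce steps in memory rather than on Hadoop, could look like the following. The function names (map_phase, reduce_phase, pairwise_jaccard) and the toy edge list are hypothetical and are not taken from the paper.

```python
from collections import defaultdict
from itertools import combinations


def jaccard(a, b):
    """Jaccard coefficient of two sets: |a & b| / |a | b| (0.0 if both are empty)."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def map_phase(edges):
    """Map step (assumed): emit (node, neighbour) for both ends of each undirected edge."""
    for u, v in edges:
        yield u, v
        yield v, u


def reduce_phase(pairs):
    """Reduce step (assumed): group the emitted neighbours by node."""
    neighbours = defaultdict(set)
    for node, neighbour in pairs:
        neighbours[node].add(neighbour)
    return dict(neighbours)


def pairwise_jaccard(edges):
    """Jaccard coefficient of the neighbour sets for every pair of nodes."""
    neighbours = reduce_phase(map_phase(edges))
    return {
        (u, v): jaccard(neighbours[u], neighbours[v])
        for u, v in combinations(sorted(neighbours), 2)
    }


if __name__ == "__main__":
    # Hypothetical toy graph, for illustration only.
    toy_edges = [("a", "b"), ("a", "c"), ("b", "c"), ("c", "d")]
    for pair, score in pairwise_jaccard(toy_edges).items():
        print(pair, round(score, 3))
```

Comparing every pair of nodes is quadratic in the number of nodes, so this sketch only illustrates the quantity being computed; a cluster-scale job would need to restrict or partition the candidate pairs.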