CBL: exploiting community based locality for efficient content search in online social networks

Hanhua Chen, Fan Zhang, Hai Jin
Published in: IEEE International Symposium on High-Performance Parallel Distributed Computing, 2014-06-23
DOI: 10.1145/2600212.2600707 (https://doi.org/10.1145/2600212.2600707)
Citations: 1

Abstract

Retrieving relevant data for users in online social network (OSN) systems is a challenging problem. Cassandra, a storage system used by popular OSN systems such as Facebook and Twitter, relies on a scheme based on a distributed hash table (DHT) to randomly partition the personal data of users among servers across multiple data centers. Although a DHT scales well to a large number of users (and their personal data), it leads to costly inter-server communication across data centers because OSN users are densely interconnected and interact frequently. In this paper, we explore how to retrieve OSN content cost-effectively while retaining the simple and robust nature of OSNs. Our approach exploits a simple yet powerful principle called Community-Based Locality (CBL), which posits that if a user has a one-hop neighbor within a particular community, the user is very likely to have other one-hop neighbors inside the same community. We demonstrate the existence of community-based locality in diverse traces of popular OSN systems such as Facebook, Orkut, Flickr, YouTube, and LiveJournal. Based on this observation, we design a CBL-based algorithm to build the content index in OSN systems. By partitioning and indexing the relevant data of users within a community on the same server in the data center, the CBL-based index avoids a significant amount of inter-server communication during searching, which makes retrieving relevant data for a user in large-scale OSNs efficient. In addition, the CBL-based scheme provides much shorter query latency and balanced loads. We conduct comprehensive trace-driven simulations to evaluate the performance of the proposed scheme. Results show that our scheme reduces network traffic by 73% compared with existing schemes.
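
The contrast between hash-based and community-based placement can be illustrated with a toy sketch. This is not the algorithm from the paper: the graph, the community labels, the server count, and all function names below are hypothetical, and the community labels are assumed to be given rather than detected. The sketch only counts how many remote servers a user must contact to fetch the data of all one-hop neighbors under each placement.

```python
from zlib import crc32

# Hypothetical toy friendship graph: user -> one-hop neighbors.
NEIGHBORS = {
    "alice": ["bob", "carol", "dave"],
    "bob": ["alice", "carol"],
    "carol": ["alice", "bob"],
    "dave": ["alice", "erin"],
    "erin": ["dave"],
}

# Assumed community labels (given as input here, not detected).
COMMUNITY_OF = {"alice": 0, "bob": 0, "carol": 0, "dave": 1, "erin": 1}

NUM_SERVERS = 4  # assumed number of servers


def hash_placement(user):
    """Stand-in for DHT-style placement: server derived from a hash of the user id."""
    return crc32(user.encode("utf-8")) % NUM_SERVERS


def community_placement(user):
    """Community-based placement: all members of a community share one server."""
    return COMMUNITY_OF[user] % NUM_SERVERS


def remote_servers_contacted(user, place):
    """Number of distinct servers, other than the user's own, holding neighbor data."""
    home = place(user)
    return len({place(n) for n in NEIGHBORS[user]} - {home})


def total_remote_contacts(place):
    """Sum of remote servers contacted when every user fetches all neighbors' data."""
    return sum(remote_servers_contacted(u, place) for u in NEIGHBORS)


print("hash placement:     ", total_remote_contacts(hash_placement))
print("community placement:", total_remote_contacts(community_placement))
```

Under the community-based placement, only the edge between the two communities (alice and dave) causes remote contacts. This toy mapping of community to server ignores load balancing, which the paper states its scheme also addresses.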