Muhammad Anis Uddin Nasir, Fatemeh Rahimian, Sarunas Girdzijauskas
Gossip-based partitioning and replication for Online Social Networks
2014 IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining (ASONAM 2014)
Published: 2014-08-17
DOI: 10.1109/ASONAM.2014.6921557
Citations: 5
Abstract
Online Social Networks (OSNs) have grown tremendously in size and popularity over the last decade, attracting billions of users from all over the world. Such networks generate petabytes of data from the social interactions among their users and create many management and scalability challenges. OSN users share common interests and exhibit strong community structures, which create complex dependency patterns within OSN data and thus make the data difficult to partition and distribute in a data center environment. Existing solutions, such as distributed databases, key-value stores, and auto-scaling services, use random partitioning to distribute the data across a cluster, which breaks existing dependencies in the OSN data and may generate huge inter-server traffic. Therefore, there is a need for an intelligent data allocation strategy that can reduce the network cost of various OSN operations. In this paper, we present a gossip-based partitioning and replication scheme that efficiently splits OSN data and distributes it across a cluster. We achieve fault tolerance and data locality for one-hop neighbors through replication. Our main contribution is a social graph placement strategy that divides the social graph into partitions of predefined size and periodically updates the partitions to place socially connected users together. To evaluate our algorithm, we compare it with random partitioning and with SPAR, a state-of-the-art solution. Results show that our algorithm generates up to four times less replication overhead than random partitioning and half the replication overhead of SPAR.
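
To make the notion of replication overhead concrete, the following is a minimal Python sketch and not the algorithm from the paper. It assumes that overhead is counted as the number of (user, server) replicas needed so that every user's one-hop neighbors are available locally. It also assumes a simple refinement loop that swaps two users between servers whenever the swap lowers that count; swapping is chosen here only because it keeps partition sizes fixed. The names `replication_overhead` and `swap_refine` are hypothetical, and the loop evaluates the cost globally in a single process, whereas a gossip protocol would rely on local exchanges between nodes.

```python
import random
from collections import Counter


def replication_overhead(edges, placement):
    """Count replicas needed so each user's one-hop neighbors are local.

    Assumption: for every edge whose endpoints live on different servers,
    each endpoint needs a replica on the other endpoint's server. Every
    distinct (user, server) replica is counted once.
    """
    replicas = set()
    for u, v in edges:
        if placement[u] != placement[v]:
            replicas.add((v, placement[u]))
            replicas.add((u, placement[v]))
    return len(replicas)


def swap_refine(edges, placement, rounds=200, seed=0):
    """Hypothetical refinement: swap two users if that lowers the overhead.

    Swapping (rather than moving) users keeps every partition at its
    predefined size. This is a centralized simulation for illustration.
    """
    rng = random.Random(seed)
    placement = dict(placement)  # work on a copy
    users = sorted(placement)
    best = replication_overhead(edges, placement)
    for _ in range(rounds):
        a, b = rng.sample(users, 2)
        if placement[a] == placement[b]:
            continue
        placement[a], placement[b] = placement[b], placement[a]
        cost = replication_overhead(edges, placement)
        if cost < best:
            best = cost
        else:
            # Undo swaps that do not help.
            placement[a], placement[b] = placement[b], placement[a]
    return placement


# Toy graph: two triangles (0,1,2) and (3,4,5) joined by the edge (2,3).
edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
# A placement that ignores community structure: users alternate servers.
initial = {0: 0, 1: 1, 2: 0, 3: 1, 4: 0, 5: 1}
# A placement that keeps each triangle on one server.
community = {0: 0, 1: 0, 2: 0, 3: 1, 4: 1, 5: 1}

refined = swap_refine(edges, initial)
print("alternating placement:", replication_overhead(edges, initial))
print("community placement:", replication_overhead(edges, community))
print("after swap refinement:", replication_overhead(edges, refined))
```

In this toy graph the alternating placement forces every user to be replicated on the other server, while the community-aware placement needs replicas only for the two users on the bridging edge. That gap is the kind of saving a socially aware placement strategy aims for.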