Efficient Graph Models for Retrieving Top-k News Feeds from Ego Networks
Rene Pickhardt, Thomas Gottron, A. Scherp, Steffen Staab, J. Kunze
2012 International Conference on Privacy, Security, Risk and Trust and 2012 International Conference on Social Computing, 2012-09-03
DOI: https://doi.org/10.1109/SocialCom-PASSAT.2012.73
Citations: 3
Abstract
A key challenge of web platforms like social networking sites and services for news feed aggregation is the efficient and targeted distribution of new content items to users. This can be formulated as the problem of retrieving the top-k news items out of the d-degree ego network of each given user, where the set of all users producing feeds is of size n, with n ≫ d ≫ k and typically k < 20. Existing approaches either employ expensive join operations on global indices or suffer from high redundancy through denormalization. This makes retrieval of different top-k news feeds for thousands of users per second very inefficient in a large social network. In this paper, we propose two graph models, GRAPHITY and STOU, to address this problem. GRAPHITY is optimized for fast retrieval of news feeds and has a runtime of O(k log(k)). The GRAPHITY index does not involve data redundancy. An update of the index upon insertion of a new item into the feed is possible in a runtime linear in the node's in-degree d_in. New content can be stored in STOU in O(1), at the cost of a slower retrieval speed of O(d log(d)). We verify the theoretical runtime complexity of GRAPHITY and STOU on two data sets of different characteristics and size. We show that on a single machine GRAPHITY is able to retrieve more than 10,000 unique news feeds per second in a network with more than one million users. Our evaluation confirms that retrieval of news feeds with GRAPHITY is independent of the node degree d of a user's ego network and of the network size n, and that it scales to networks of arbitrary size.
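
The two retrieval bounds quoted in the abstract can be illustrated with a small sketch. It is not the GRAPHITY or STOU implementation, whose data structures the abstract does not describe. It only shows one way an O(k log(k)) bound can arise, assuming (hypothetically) that each followee's feed is stored newest first and that the ego network is kept ordered by each followee's most recent item. Under that assumption the top-k items can only come from the first k feeds, so a k-way merge suffices. Without the ordering, the d feeds must first be sorted, which costs O(d log(d)). The data layout and the function names `top_k_presorted` and `top_k_unsorted` are illustrative assumptions.

```python
import heapq
from itertools import islice

# Assumed layout (not from the paper): a feed is a list of
# (timestamp, item_id) tuples, sorted newest first.

def top_k_presorted(ego_feeds, k):
    """Top-k items when ego_feeds is already ordered by the timestamp of
    each feed's newest item, descending (empty feeds, if any, come last).

    Every item in a feed beyond position k is no newer than the heads of
    the first k feeds, so only those k feeds need to be merged.
    """
    candidates = ego_feeds[:k]
    # Lazy k-way merge over descending inputs; heap size is at most k.
    merged = heapq.merge(*candidates, key=lambda entry: entry[0], reverse=True)
    return list(islice(merged, k))

def top_k_unsorted(ego_feeds, k):
    """Top-k items when the ego network has no maintained order:
    sort the d feeds by their newest item first, then merge as above."""
    nonempty = [feed for feed in ego_feeds if feed]
    ordered = sorted(nonempty, key=lambda feed: feed[0][0], reverse=True)
    return top_k_presorted(ordered, k)

# Toy ego network with four followees (hypothetical data).
alice = [(9, "a3"), (4, "a2"), (1, "a1")]
bob = [(8, "b2"), (7, "b1")]
carol = [(5, "c1")]
dave = [(2, "d1")]

print(top_k_presorted([alice, bob, carol, dave], 3))
print(top_k_unsorted([dave, carol, alice, bob], 3))
```

The trade-off mirrors the one stated in the abstract: keeping the ordering up to date costs work at write time, while skipping it makes writes cheap and moves the sorting cost to each read.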