Influence Maximization in Near-Linear Time: A Martingale Approach

Proceedings of the 2015 ACM SIGMOD International Conference on Management of Data. Pub Date: 2015-05-27. DOI: 10.1145/2723372.2723734

Youze Tang, Yanchen Shi, Xiaokui Xiao


Citations: 674

Abstract

Given a social network $G$ and a positive integer $k$, the influence maximization problem asks for $k$ nodes in $G$ whose adoption of a certain idea or product can trigger the largest expected number of follow-up adoptions by the remaining nodes. This problem has been extensively studied in the literature, and the state-of-the-art technique runs in $O((k+\ell)(n+m)\log n / \varepsilon^2)$ expected time and returns a $(1 - 1/e - \varepsilon)$-approximate solution with probability at least $1 - 1/n^{\ell}$. This paper presents an influence maximization algorithm that provides the same worst-case guarantees as the state of the art, but offers significantly improved empirical efficiency. The core of our algorithm is a set of estimation techniques based on martingales, a classic statistical tool. Those techniques not only provide accurate results with small computation overheads, but also enable our algorithm to support a larger class of information diffusion models than existing methods do. We experimentally evaluate our algorithm against the state of the art under several popular diffusion models, using real social networks with up to 1.4 billion edges. Our experimental results show that the proposed algorithm consistently outperforms the state of the art in terms of computation efficiency, and is often orders of magnitude faster.
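To make the problem statement concrete, the following is a minimal sketch of the classic simulation-based greedy baseline, not the martingale-based algorithm this paper proposes. It assumes the independent cascade diffusion model (the abstract does not name a specific model), and the names `simulate_cascade`, `estimate_spread`, and `greedy_seeds`, as well as the adjacency-dictionary graph format, are hypothetical choices made for illustration only.

```python
import random
from collections import deque


def simulate_cascade(graph, seeds, rng):
    """Run one independent-cascade simulation (assumed model).

    graph maps a node to a list of (neighbor, probability) pairs.
    Returns the number of activated nodes, seeds included.
    """
    active = set(seeds)
    frontier = deque(seeds)
    while frontier:
        u = frontier.popleft()
        for v, p in graph.get(u, []):
            # Each newly active node gets one chance to activate each neighbor.
            if v not in active and rng.random() < p:
                active.add(v)
                frontier.append(v)
    return len(active)


def estimate_spread(graph, seeds, runs=1000, seed=0):
    """Monte Carlo estimate of the expected number of activated nodes."""
    rng = random.Random(seed)
    total = sum(simulate_cascade(graph, seeds, rng) for _ in range(runs))
    return total / runs


def greedy_seeds(graph, nodes, k, runs=1000):
    """Pick k seeds, each time adding the node with the best estimated spread."""
    chosen = []
    for _ in range(k):
        candidates = [v for v in nodes if v not in chosen]
        best = max(candidates,
                   key=lambda v: estimate_spread(graph, chosen + [v], runs))
        chosen.append(best)
    return chosen


if __name__ == "__main__":
    # Toy graph with hypothetical edge probabilities.
    toy = {"a": [("b", 0.5), ("c", 0.5)], "b": [("d", 0.5)], "e": [("f", 0.5)]}
    print(greedy_seeds(toy, ["a", "b", "c", "d", "e", "f"], k=2))
```

Each greedy step here re-runs many simulations per candidate node, which is the kind of cost that motivates near-linear-time approaches such as the one described in the abstract.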


