Link prediction for ex ante influence maximization on temporal networks
Eric Yanchenko, Tsuyoshi Murata, Petter Holme
Applied Network Science, published 2023-09-25
DOI: 10.1007/s41109-023-00594-z (https://doi.org/10.1007/s41109-023-00594-z)
Citations: 2
Abstract
Influence maximization (IM) is the task of finding the most important nodes in order to maximize the spread of influence or information on a network. This task is typically studied on static or temporal networks where the complete topology of the graph is known. In practice, however, the seed nodes must be selected before observing the future evolution of the network. In this work, we consider this realistic ex ante setting, where p time steps of the network have been observed before selecting the seed nodes. The influence is then calculated after the network continues to evolve for a total of $T > p$ time steps. We address this problem by using statistical, non-negative matrix factorization, and graph neural network link prediction algorithms to predict the future evolution of the network, and then apply existing influence maximization algorithms to the predicted networks. Additionally, the output of the link prediction methods can be used to construct novel IM algorithms. We apply the proposed methods to eight real-world and synthetic networks to compare their performance using the susceptible-infected (SI) diffusion model. We demonstrate that it is possible to construct quality seed sets in the ex ante setting, as we achieve influence spread within 87% of the optimal spread on seven of eight networks. In many settings, choosing seed nodes based only on historical edges provides results comparable to those obtained by treating the future graph snapshots as known. The proposed heuristics based on the link prediction model are also some of the best-performing methods. These findings indicate that, for these eight networks under the SI model, the latent process which determines the most influential nodes may not have large temporal variation. Thus, knowing the future status of the network is not necessary to obtain good results for ex ante IM.
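To make the ex ante setting concrete, the following is a minimal Python sketch, not the authors' implementation. It assumes a temporal network stored as a list of undirected edge lists (one per time step), picks seeds from the first p snapshots only, and then simulates SI diffusion over all T snapshots. The function names (`si_spread`, `degree_seeds`, `expected_spread`), the aggregated-degree seed heuristic, the infection probability `beta`, and the toy data are all illustrative assumptions; the abstract does not specify these details, including whether diffusion starts at the first snapshot.

```python
import random
from collections import Counter


def si_spread(snapshots, seeds, beta, rng):
    """Run one SI simulation over the snapshots; return the number of infected nodes."""
    infected = set(seeds)
    for edges in snapshots:
        newly = set()
        for u, v in edges:
            # An undirected contact transmits with probability beta
            # when exactly one endpoint is already infected.
            if (u in infected) != (v in infected) and rng.random() < beta:
                newly.add(v if u in infected else u)
        # Nodes infected in this snapshot start spreading in the next one.
        infected |= newly
    return len(infected)


def degree_seeds(observed, k):
    """Pick the k nodes with the highest degree aggregated over the observed snapshots."""
    degree = Counter()
    for edges in observed:
        for u, v in edges:
            degree[u] += 1
            degree[v] += 1
    return [node for node, _ in degree.most_common(k)]


def expected_spread(snapshots, seeds, beta=0.5, runs=200, seed=0):
    """Monte Carlo estimate of the mean SI spread for a given seed set."""
    rng = random.Random(seed)
    return sum(si_spread(snapshots, seeds, beta, rng) for _ in range(runs)) / runs


# Toy temporal network (assumed data): T = 4 snapshots, of which p = 2 are observed.
snapshots = [
    [(0, 1), (0, 2)],
    [(0, 3), (1, 2)],
    [(0, 4), (3, 5)],
    [(2, 5), (4, 6)],
]
p = 2

ex_ante_seeds = degree_seeds(snapshots[:p], k=1)  # uses historical edges only
oracle_seeds = degree_seeds(snapshots, k=1)       # treats future snapshots as known

print("ex ante seeds:", ex_ante_seeds, "spread:", expected_spread(snapshots, ex_ante_seeds))
print("oracle seeds:", oracle_seeds, "spread:", expected_spread(snapshots, oracle_seeds))
```

Comparing the two printed spreads mirrors, in miniature, the paper's comparison between seeds chosen from historical edges and seeds chosen with the future snapshots known. A link prediction step would slot in between, by replacing `snapshots[:p]` with the observed snapshots plus predicted future ones.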