On Efficient Single-Source Personalized PageRank Computation in Online Social Networks

IF 8.9 2区 计算机科学 Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE
Victor Junqiu Wei;Di Jiang;Jason Chen Zhang
{"title":"On Efficient Single-Source Personalized PageRank Computation in Online Social Networks","authors":"Victor Junqiu Wei;Di Jiang;Jason Chen Zhang","doi":"10.1109/TKDE.2025.3551751","DOIUrl":null,"url":null,"abstract":"The Single-Source Personalized PageRank (SSPPR) problem is widely used in information retrieval and recommendation systems. Traditional algorithms assume full knowledge of the network, making them inapplicable to online social networks (OSNs), where the topology is unknown, and users can only explore the network step by step via APIs. The only feasible approach for SSPPR in OSNs is Monte Carlo (MC) simulation, but traditional MC methods rely on static sampling, which lacks flexibility, delays feedback, and overestimates the number of required random walks. To address these limitations, we propose PANDA (Single-Source Personalized PageRank on OSNs with Rademacher Average), a progressive sampling algorithm. PANDA iteratively samples random walks in batches, estimating accuracy dynamically using Rademacher Average from statistical learning theory. This data-dependent approach allows for early termination once the desired accuracy is met. Additionally, PANDA features a dynamic sampling schedule to optimize efficiency. Empirical studies show that PANDA significantly outperforms existing methods, achieving the same accuracy with far greater efficiency.","PeriodicalId":13496,"journal":{"name":"IEEE Transactions on Knowledge and Data Engineering","volume":"37 6","pages":"3598-3612"},"PeriodicalIF":8.9000,"publicationDate":"2025-03-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"IEEE Transactions on Knowledge and Data Engineering","FirstCategoryId":"94","ListUrlMain":"https://ieeexplore.ieee.org/document/10937368/","RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE","Score":null,"Total":0}
引用次数: 0

Abstract

The Single-Source Personalized PageRank (SSPPR) problem is widely used in information retrieval and recommendation systems. Traditional algorithms assume full knowledge of the network, making them inapplicable to online social networks (OSNs), where the topology is unknown, and users can only explore the network step by step via APIs. The only feasible approach for SSPPR in OSNs is Monte Carlo (MC) simulation, but traditional MC methods rely on static sampling, which lacks flexibility, delays feedback, and overestimates the number of required random walks. To address these limitations, we propose PANDA (Single-Source Personalized PageRank on OSNs with Rademacher Average), a progressive sampling algorithm. PANDA iteratively samples random walks in batches, estimating accuracy dynamically using Rademacher Average from statistical learning theory. This data-dependent approach allows for early termination once the desired accuracy is met. Additionally, PANDA features a dynamic sampling schedule to optimize efficiency. Empirical studies show that PANDA significantly outperforms existing methods, achieving the same accuracy with far greater efficiency.
在线社交网络中高效的单源个性化PageRank计算
单源个性化PageRank (SSPPR)问题在信息检索和推荐系统中得到了广泛的应用。传统算法假设对网络有充分的了解,不适用于网络拓扑未知的在线社交网络(online social network, osn),用户只能通过api逐步探索网络。对于SSPPR在OSNs中唯一可行的方法是蒙特卡罗(Monte Carlo, MC)模拟,但传统的MC方法依赖于静态采样,缺乏灵活性,延迟反馈,并且高估了所需随机游动的数量。为了解决这些限制,我们提出了一种渐进式采样算法PANDA (Single-Source personalpagerank on OSNs with Rademacher Average)。PANDA采用统计学习理论中的Rademacher Average对随机漫步进行分批迭代采样,动态估计准确率。这种依赖于数据的方法允许在满足所需精度后尽早终止。此外,PANDA还具有动态采样计划以优化效率。实证研究表明,PANDA明显优于现有的方法,以更高的效率达到相同的精度。
本文章由计算机程序翻译,如有差异,请以英文原文为准。
求助全文
约1分钟内获得全文 求助全文
来源期刊
IEEE Transactions on Knowledge and Data Engineering
IEEE Transactions on Knowledge and Data Engineering 工程技术-工程:电子与电气
CiteScore
11.70
自引率
3.40%
发文量
515
审稿时长
6 months
期刊介绍: The IEEE Transactions on Knowledge and Data Engineering encompasses knowledge and data engineering aspects within computer science, artificial intelligence, electrical engineering, computer engineering, and related fields. It provides an interdisciplinary platform for disseminating new developments in knowledge and data engineering and explores the practicality of these concepts in both hardware and software. Specific areas covered include knowledge-based and expert systems, AI techniques for knowledge and data management, tools, and methodologies, distributed processing, real-time systems, architectures, data management practices, database design, query languages, security, fault tolerance, statistical databases, algorithms, performance evaluation, and applications.
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
确定
请完成安全验证×
copy
已复制链接
快去分享给好友吧!
我知道了
右上角分享
点击右上角分享
0
联系我们:info@booksci.cn Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。 Copyright © 2023 布克学术 All rights reserved.
京ICP备2023020795号-1
ghs 京公网安备 11010802042870号
Book学术文献互助
Book学术文献互助群
群 号:481959085
Book学术官方微信