Proceedings of the ACM on Management of Data: Latest Articles

Efficient Algorithm for Budgeted Adaptive Influence Maximization: An Incremental RR-set Update Approach
Proceedings of the ACM on Management of Data Pub Date : 2023-11-13 DOI: 10.1145/3617328
Qintian Guo, Chen Feng, Fangyuan Zhang, Sibo Wang
Abstract: Given a graph G, a cost associated with each node, and a budget B, budgeted influence maximization (BIM) aims to find the set S of seed nodes that maximizes the influence among all sets whose total node cost is no larger than B. Existing solutions mainly follow the non-adaptive idea, i.e., they determine all the seeds before observing any actual diffusion. Lacking actual diffusion information, they may yield unsatisfactory influence spread. Motivated by this limitation, in this paper we make the first attempt to solve the BIM problem under the adaptive setting, where seed nodes are selected iteratively after observing the diffusion result of the previous seeds. We design the first practical algorithm, which achieves an expected approximation guarantee by probabilistically adopting either a cost-aware greedy idea or a single influential node. Further, we develop an optimized version to improve its practical performance in terms of influence spread. Besides, the scalability issues of adaptive IM-related problems remain open. This is because such algorithms usually involve multiple rounds (e.g., as many as the number of seeds), and in each round they have to construct sufficiently many new reverse-reachable set (RR-set) samples for the claimed approximation guarantee to actually hold. This incurs prohibitive computation and limits real applications. To resolve this dilemma, we propose an incremental update approach. Specifically, it maintains extra construction information when building RR-sets, and can then quickly correct a problematic RR-set from the very step where it is first affected. As a result, we recycle the RR-sets at a small computational cost while still providing a correctness guarantee. Finally, extensive experiments on large-scale real graphs demonstrate the superiority of our algorithms over baselines in terms of both influence spread and running time.
Citations: 0
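For readers unfamiliar with the reverse-reachable sets mentioned in the entry above, the following is a minimal sketch of standard RR-set sampling, not the paper's incremental update method. It assumes the independent cascade diffusion model (the abstract does not name a model), and the names `sample_rr_set`, `estimate_spread`, and `in_neighbors` are hypothetical.

```python
import random
from collections import deque


def sample_rr_set(in_neighbors, root, rng):
    """Sample one reverse-reachable set rooted at `root`.

    Assumption: independent cascade model, where each edge u -> v is
    live independently with its own probability.
    `in_neighbors` maps a node v to a list of (u, probability) pairs,
    one per incoming edge u -> v.
    """
    rr_set = {root}
    queue = deque([root])
    while queue:
        node = queue.popleft()
        # Walk edges backwards: u joins the set if edge u -> node is live.
        for source, prob in in_neighbors.get(node, []):
            if source not in rr_set and rng.random() < prob:
                rr_set.add(source)
                queue.append(source)
    return rr_set


def estimate_spread(seeds, rr_sets, num_nodes):
    """Estimate influence as num_nodes times the fraction of RR-sets hit by `seeds`.

    The estimate is only meaningful when the RR-set roots were drawn
    uniformly at random from all nodes.
    """
    covered = sum(1 for rr in rr_sets if rr & seeds)
    return num_nodes * covered / len(rr_sets)


# Toy graph: a -> b -> c always propagate, d -> c never does.
toy_in_neighbors = {"b": [("a", 1.0)], "c": [("b", 1.0), ("d", 0.0)]}
toy_rr = sample_rr_set(toy_in_neighbors, "c", random.Random(0))
```

In an adaptive setting, nodes activated in earlier rounds change which RR-sets remain valid, which is why the abstract describes regenerating or correcting samples in each round.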
Secure Sampling for Approximate Multi-party Query Processing
Proceedings of the ACM on Management of Data Pub Date : 2023-11-13 DOI: 10.1145/3617339
Qiyao Luo, Yilei Wang, Ke Yi, Sheng Wang, Feifei Li
Abstract: We study the problem of random sampling in the secure multi-party computation (MPC) model. In MPC, taking a sample securely must cost Ω(n) irrespective of the sample size s. This is in stark contrast with the plaintext setting, where a sample can be taken in O(s) time trivially. Thus, the goal of approximate query processing (AQP) with sublinear costs seems unachievable under MPC. To get around this inherent barrier, in this paper we take a two-stage approach: in the offline stage, we generate a batch of n/s samples with Õ(n) total cost, which can then be consumed to answer queries as they arrive online. Such an approach allows us to achieve an Õ(s) amortized cost per query, similar to the plaintext setting. Based on our secure batch sampling algorithms, we build MASQUE, an MPC-AQP system that achieves sublinear online query costs by running an MPC protocol to evaluate the queries on pre-generated samples. MASQUE achieves the strong security guarantee of the MPC model, i.e., nothing is revealed beyond the query result, which itself can be further protected by (amplified) differential privacy.
Citations: 0
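The amortized bound in the entry above follows from a one-line calculation. The sketch below assumes the offline batch cost is Õ(n), which is what the stated Õ(s) amortized cost implies, and that each query consumes one pre-generated sample of size s.

```latex
\[
\underbrace{\tilde{O}(n)}_{\text{offline cost of one batch}}
\Big/
\underbrace{\frac{n}{s}}_{\text{samples in the batch}}
\;=\; \tilde{O}(s) \quad \text{amortized cost per sample.}
\]
```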
Modularity-based Hypergraph Clustering: Random Hypergraph Model, Hyperedge-cluster Relation, and Computation
Proceedings of the ACM on Management of Data Pub Date : 2023-11-13 DOI: 10.1145/3617335
Zijin Feng, Miao Qiao, Hong Cheng
Abstract: A graph models the connections among objects. One important graph analytical task is clustering, which partitions a data graph into clusters with dense inner-cluster connections. One line of clustering methods maximizes a function called modularity. Modularity-based clustering is widely adopted on dyadic graphs due to its scalability and clustering quality, which depends highly on the choice of a random graph model. The random graph model decides not only which clustering is preferred (modularity measures the quality of a clustering by its alignment to the edges of a random graph) but also the cost of computing such an alignment. Existing random hypergraph models either measure the hyperedge-cluster alignment in an All-Or-Nothing (AON) manner, losing important group-wise information, or introduce expensive alignment computation, preventing the clustering from scaling up. This paper proposes a new random hypergraph model called Hyperedge Expansion Model (HEM), a non-AON hypergraph modularity function called Partial Innerclusteredge modularity (PI) based on HEM, a clustering algorithm called Partial Innerclusteredge Clustering (PIC) that optimizes PI, and novel computation optimizations. PIC is a scalable modularity-based hypergraph clustering algorithm that can effectively capture the non-AON hyperedge-cluster relation. Our experiments show that PIC outperforms eight state-of-the-art methods on real-world hypergraphs in terms of both clustering quality and scalability, and is up to five orders of magnitude faster than the baseline methods.
Citations: 0
Fast Maximal Quasi-clique Enumeration: A Pruning and Branching Co-Design Approach
Proceedings of the ACM on Management of Data Pub Date : 2023-11-13 DOI: 10.1145/3617331
Kaiqiang Yu, Cheng Long
Abstract: Mining cohesive subgraphs from a graph is a fundamental problem in graph data analysis. One notable cohesive structure is the γ-quasi-clique (QC), where each vertex connects to at least a fraction γ of the other vertices inside. Enumerating the maximal γ-quasi-cliques (MQCs) of a graph has been widely studied and used in many applications, such as community detection and significant biomolecule structure discovery. One common practice for finding all MQCs is to (1) find a set of QCs containing all MQCs and then (2) filter out non-maximal QCs. While quite a few branch-and-bound algorithms have been developed for finding a set of QCs that contains all MQCs, all of them focus on sharpening the pruning techniques and devote little effort to improving the branching part. As a result, they provide no guarantee on pruning branches, and all have a worst-case time complexity of O*(2^n), where O* suppresses polynomial factors and n is the number of vertices in the graph. In this paper, we focus on the problem of finding a set of QCs containing all MQCs, but deviate from further sharpening the pruning techniques as existing methods do. We pay attention to both the pruning and branching parts and develop new pruning techniques and branching methods that suit each other better, pruning more branches both theoretically and practically. Specifically, we develop a new branch-and-bound algorithm called FastQC based on the newly developed pruning techniques and branching methods, which improves the worst-case time complexity to O*(α_k^n), where α_k is a positive real number strictly smaller than 2. Furthermore, we develop a divide-and-conquer strategy for boosting the performance of FastQC. Finally, we conduct extensive experiments on both real and synthetic datasets, and the results show that our algorithms are up to two orders of magnitude faster than the state of the art on real datasets.
Citations: 0
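The γ-quasi-clique condition quoted in the entry above can be made concrete with a small membership check. This is only an illustrative sketch of the degree condition as the abstract states it; it is not FastQC, it does not test maximality, and it does not check any further requirement (such as connectivity) that the paper's formal definition may impose. The function name `is_quasi_clique` and the adjacency format are hypothetical.

```python
from fractions import Fraction


def is_quasi_clique(adjacency, vertices, gamma):
    """Check the degree condition of a gamma-quasi-clique.

    `adjacency` maps each vertex to the set of its neighbours
    (undirected graph, no self-loops assumed).
    Every vertex in `vertices` must be adjacent to at least
    gamma * (|vertices| - 1) of the other vertices in the set.
    Passing `gamma` as a Fraction keeps the threshold exact.
    """
    members = set(vertices)
    if not members:
        return False
    threshold = gamma * (len(members) - 1)
    return all(len(adjacency[v] & members) >= threshold for v in members)


# A 4-cycle: every vertex is adjacent to 2 of the 3 other vertices.
cycle = {1: {2, 4}, 2: {1, 3}, 3: {2, 4}, 4: {1, 3}}
loose = is_quasi_clique(cycle, {1, 2, 3, 4}, Fraction(2, 3))
strict = is_quasi_clique(cycle, {1, 2, 3, 4}, Fraction(3, 4))
```

Here the 4-cycle satisfies the condition for γ = 2/3 but not for γ = 3/4, since each vertex reaches only two of its three peers.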
Origin-Destination Travel Time Oracle for Map-based Services
Proceedings of the ACM on Management of Data Pub Date : 2023-11-13 DOI: 10.1145/3617337
Yan Lin, Huaiyu Wan, Jilin Hu, Shengnan Guo, Bin Yang, Youfang Lin, Christian S. Jensen
Abstract: Given an origin (O), a destination (D), and a departure time (T), an Origin-Destination (OD) travel time oracle (ODT-Oracle) returns an estimate of the time it takes to travel from O to D when departing at T. ODT-Oracles serve important purposes in map-based services. To enable the construction of such oracles, we provide a travel-time estimation (TTE) solution that leverages historical trajectories to estimate time-varying travel times for OD pairs. The problem is complicated by the fact that multiple historical trajectories with different travel times may connect an OD pair, while trajectories may vary from one another. To solve the problem, it is crucial to remove outlier trajectories when estimating travel times for future queries. We propose a novel two-stage framework called Diffusion-based Origin-destination Travel Time Estimation (DOT) that solves the problem. First, DOT employs a conditioned Pixelated Trajectories (PiT) denoiser that enables building a diffusion-based PiT inference process by learning correlations between OD pairs and historical trajectories. Specifically, given an OD pair and a departure time, we aim to infer a PiT. Next, DOT encompasses a Masked Vision Transformer (MViT) that effectively and efficiently estimates a travel time based on the inferred PiT. We report on extensive experiments on two real-world datasets that offer evidence that DOT is capable of outperforming baseline methods in terms of accuracy, scalability, and explainability.
Citations: 0