CloudRGK: Towards Private Similarity Measurement Between Graphs on the Cloud
Authors: Linxiao Yu; Jun Tao; Yifan Xu; Haotian Wang
Journal: IEEE Transactions on Knowledge and Data Engineering, 37(4), pp. 1688-1701
Published: 2025-01-15 (Journal Article)
DOI: 10.1109/TKDE.2025.3529949
URL: https://ieeexplore.ieee.org/document/10843145/
Impact factor: 8.9; JCR: Q1 (Computer Science, Artificial Intelligence)
Citation count: 0
Abstract
Graph kernels are an important class of tools for measuring the similarity of graph data, a task that underlies a wide range of graph learning methods. However, graph kernels often incur high computational overhead. With the rise of cloud computing, it is desirable to offload this computation to a server with abundant resources and thereby reduce the cost on local machines. Nonetheless, under the honest-but-curious cloud assumption, the server may inspect the data, raising privacy concerns. To eliminate the risk of data privacy leakage, we propose CloudRGK, which securely performs the Random walk Graph Kernel (RGK), one of the best-known graph kernels, on the cloud. We first prove that edge- and vertex-labeled graphs can be transformed into an equivalent matrix representation. We then prove that the cloud can perform the core operations of RGK on the encrypted graphs without loss of feature information. Evaluations on real-world graph data demonstrate that our strategy significantly reduces the overhead the local party incurs in performing RGK, without performance degradation, while introducing only a small amount of extra computation. To the best of our knowledge, this is the first work towards private graph kernel computation on the cloud.
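For readers unfamiliar with the kernel named in the abstract, the following is a minimal sketch of the textbook geometric random walk kernel on unlabeled graphs. It is not the CloudRGK protocol: it involves no encryption or labels, and the function name `random_walk_kernel`, the decay parameter `lam`, and the uniform start and stop distributions are illustrative assumptions. It does show why the local cost is high: the direct product graph of two graphs with n1 and n2 vertices has n1*n2 vertices.

```python
import numpy as np


def random_walk_kernel(A1, A2, lam=0.1):
    """Geometric random walk kernel between two unlabeled graphs.

    A1, A2: square adjacency matrices. lam: decay factor (assumed name).
    Computes q^T (I - lam * Wx)^{-1} p, where Wx is the adjacency matrix
    of the direct product graph and p, q are uniform distributions.
    """
    A1 = np.asarray(A1, dtype=float)
    A2 = np.asarray(A2, dtype=float)

    # Adjacency matrix of the direct product graph (size n1*n2 x n1*n2).
    Wx = np.kron(A1, A2)
    n = Wx.shape[0]

    # The geometric series converges only if lam * spectral_radius(Wx) < 1.
    rho = np.max(np.abs(np.linalg.eigvals(Wx)))
    if lam * rho >= 1.0:
        raise ValueError("lam is too large for the series to converge")

    # Uniform starting (p) and stopping (q) distributions: an assumption.
    p = np.full(n, 1.0 / n)
    q = np.full(n, 1.0 / n)

    # Solve the linear system rather than forming the inverse explicitly.
    x = np.linalg.solve(np.eye(n) - lam * Wx, p)
    return float(q @ x)


# Usage: a triangle compared with a three-vertex path.
triangle = np.array([[0, 1, 1], [1, 0, 1], [1, 1, 0]])
path = np.array([[0, 1, 0], [1, 0, 1], [0, 1, 0]])
similarity = random_walk_kernel(triangle, path)
```

With uniform start and stop distributions, the value does not change when the vertices of either graph are relabeled. This is a general property of the kernel, stated here independently of how the paper encrypts graphs.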
About the journal:
The IEEE Transactions on Knowledge and Data Engineering encompasses knowledge and data engineering aspects within computer science, artificial intelligence, electrical engineering, computer engineering, and related fields. It provides an interdisciplinary platform for disseminating new developments in knowledge and data engineering and explores the practicality of these concepts in both hardware and software. Specific areas covered include knowledge-based and expert systems, AI techniques for knowledge and data management, tools, and methodologies, distributed processing, real-time systems, architectures, data management practices, database design, query languages, security, fault tolerance, statistical databases, algorithms, performance evaluation, and applications.