Private approximate nearest neighbor search for on-chain data based on locality-sensitive hashing

IF 6.2 · Zone 2 (Computer Science) · JCR Q1 (COMPUTER SCIENCE, THEORY & METHODS)
Siyuan Shang, Xuehui Du, Xiaohan Wang, Aodi Liu
Future Generation Computer Systems-The International Journal of Escience, vol. 164, Article 107586. Published 2024-11-05.
DOI: 10.1016/j.future.2024.107586
https://www.sciencedirect.com/science/article/pii/S0167739X24005508
Citations: 0

Abstract

Blockchain manages data with immutability, decentralization and traceability, offering new solutions for traditional information systems and greatly facilitating data sharing. However, on-chain data queries still face challenges such as low efficiency and difficulty in protecting privacy. We propose a private Approximate Nearest Neighbor (ANN) search method for on-chain data based on Locality-Sensitive Hashing (LSH), which consists of two main steps: query initialization and query implementation. In query initialization, the data management node builds hash tables for on-chain data using an improved LSH; the tables are encrypted with attribute-based encryption and stored on the blockchain. In query implementation, a node with the correct privileges uses random smart contracts to query on-chain data privately by means of a distributed point function and a privacy protection technique called oblivious masking. To validate the effectiveness of this method, we compare its performance with two ANN search algorithms: the query time is reduced by 57% and 59.2%, the average recall is increased by 4.5% and 2%, the average precision is increased by 7.7% and 6.9%, the average F1-score is increased by 6% and 4.3%, and the average initialization time is reduced by 34 times and 122 times, respectively. We also compare its performance with private ANN search methods that use homomorphic encryption, differential privacy and secure multi-party computation. The results show that our method can reduce the query time by several orders of magnitude, making it more applicable to the blockchain environment. To the best of our knowledge, this is the first private ANN search method for on-chain data that considers both query efficiency and privacy protection, achieving efficient, accurate, and private data queries.
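The query-initialization step described above builds LSH hash tables over the data. As a rough illustration of that idea only, here is a minimal sketch of a generic sign-random-projection LSH index in plain Python. It is not the paper's improved LSH, and it omits the attribute-based encryption, smart contracts, distributed point function and oblivious masking entirely. The class name `LSHIndex`, its parameters (`num_tables`, `num_bits`, `seed`) and the helper functions are assumptions made for this example.

```python
import math
import random
from collections import defaultdict


def cosine(a, b):
    """Exact cosine similarity, used only to rank the shortlisted candidates."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def make_hyperplanes(num_bits, dim, rng):
    """Draw num_bits random hyperplanes (Gaussian normal vectors) in dim dimensions."""
    return [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(num_bits)]


def signature(vec, planes):
    """Sign-random-projection hash: one bit per hyperplane, returned as a tuple."""
    return tuple(
        1 if sum(p * x for p, x in zip(plane, vec)) >= 0 else 0
        for plane in planes
    )


class LSHIndex:
    """Hypothetical index: several independent hash tables over the same data.

    Vectors separated by a small angle are likely to share a bucket in at
    least one table, so a query inspects only a few buckets.
    """

    def __init__(self, dim, num_tables=8, num_bits=6, seed=0):
        rng = random.Random(seed)
        self.planes = [make_hyperplanes(num_bits, dim, rng) for _ in range(num_tables)]
        self.tables = [defaultdict(list) for _ in range(num_tables)]
        self.data = {}

    def add(self, key, vec):
        """Store the vector and register its key in one bucket per table."""
        self.data[key] = vec
        for planes, table in zip(self.planes, self.tables):
            table[signature(vec, planes)].append(key)

    def query(self, vec, k=1):
        """Return up to k keys from the matching buckets, best cosine first."""
        candidates = set()
        for planes, table in zip(self.planes, self.tables):
            candidates.update(table.get(signature(vec, planes), []))
        ranked = sorted(candidates, key=lambda c: -cosine(vec, self.data[c]))
        return ranked[:k]
```

A caller would create `LSHIndex(dim=3)`, call `add(key, vector)` for each record, and then call `query(vector, k)`. Increasing `num_bits` makes buckets smaller and queries faster at the cost of recall, while increasing `num_tables` recovers recall at the cost of storage; this is the usual trade-off behind the recall, precision and initialization-time figures an LSH-based method reports.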
Source journal
CiteScore: 19.90
Self-citation rate: 2.70%
Articles published: 376
Review time: 10.6 months
About the journal: Computing infrastructures and systems are constantly evolving, resulting in increasingly complex and collaborative scientific applications. To cope with these advancements, there is a growing need for collaborative tools that can effectively map, control, and execute these applications. Furthermore, with the explosion of Big Data, there is a requirement for innovative methods and infrastructures to collect, analyze, and derive meaningful insights from the vast amount of data generated. This necessitates the integration of computational and storage capabilities, databases, sensors, and human collaboration. Future Generation Computer Systems aims to pioneer advancements in distributed systems, collaborative environments, high-performance computing, and Big Data analytics. It strives to stay at the forefront of developments in grids, clouds, and the Internet of Things (IoT) to effectively address the challenges posed by these wide-area, fully distributed sensing and computing systems.