Enabling efficient and accurate semantic search over encrypted cloud data

IF 6.8 · Zone 1 (Computer Science) · COMPUTER SCIENCE, INFORMATION SYSTEMS
Zixin Tang, Haihui Fan, Xiaoyan Gu, Jiang Zhou, Hui Ma, Athanasios V. Vasilakos, Bo Li
Information Sciences, vol. 719, Article 122437. Published 2025-06-18.
DOI: 10.1016/j.ins.2025.122437
https://www.sciencedirect.com/science/article/pii/S0020025525005699
Citations: 0

Abstract

The privacy and security of cloud data have drawn much attention, leading more data owners to encrypt their data before outsourcing it. However, encryption can reduce data searchability. Semantic searchable encryption aims to support flexible queries over encrypted data and to search efficiently while ensuring that results match the user's search intent. Although semantic searchable encryption schemes have made progress, they still struggle to balance accuracy, efficiency, and security. In this paper, we propose a novel Context-Enhanced Semantic Searchable Encryption (CESSE) scheme for accurate and highly efficient secure semantic search over encrypted cloud data. First, we adopt a context-enhanced pre-trained model component that mines the relevance between queries and documents through contrastive learning and produces context-enhanced vector representations, which improves search accuracy. Then, to protect privacy, we use an optimized asymmetric scalar-product-preserving encryption (optimized ASPE) algorithm to encrypt the vectors before outsourcing them to the cloud. Additionally, we construct an approximate nearest neighbor (ANN) index to accelerate vector search. Finally, we give a formal definition of security and theoretically prove the security of our scheme under a more practical threat model. Extensive experiments demonstrate that CESSE outperforms state-of-the-art baselines in both accuracy and efficiency.
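The scalar-product-preserving property that the abstract relies on can be illustrated with a minimal sketch. It shows only the textbook ASPE idea: document vectors are encrypted with the transpose of a secret invertible matrix M and query vectors with its inverse, so the cloud can rank by inner product without seeing plaintext vectors. It is not the paper's optimized ASPE, it omits the ANN index and the pre-trained encoder, and it makes no security claims. All names (`keygen`, `encrypt_document`, `encrypt_query`) and the toy vectors are assumptions made for illustration.

```python
import random

def mat_vec(m, v):
    """Multiply matrix m (a list of rows) by vector v."""
    return [sum(a * b for a, b in zip(row, v)) for row in m]

def transpose(m):
    return [list(col) for col in zip(*m)]

def invert(m):
    """Invert a square matrix by Gauss-Jordan elimination with partial pivoting."""
    n = len(m)
    # Augment m with the identity matrix.
    aug = [list(row) + [float(i == j) for j in range(n)] for i, row in enumerate(m)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("matrix is singular")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        scale = aug[col][col]
        aug[col] = [x / scale for x in aug[col]]
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [x - factor * y for x, y in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]

def keygen(dim, rng):
    """Draw a random secret matrix M and its inverse; retry if M is singular."""
    while True:
        m = [[rng.uniform(-1.0, 1.0) for _ in range(dim)] for _ in range(dim)]
        try:
            return m, invert(m)
        except ValueError:
            continue

def encrypt_document(m, p):
    """Data owner side: document vector p becomes M^T p."""
    return mat_vec(transpose(m), p)

def encrypt_query(m_inv, q):
    """User side: query vector q becomes M^{-1} q."""
    return mat_vec(m_inv, q)

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

rng = random.Random(0)
M, M_inv = keygen(4, rng)

# Toy stand-ins for the context-enhanced embeddings (hypothetical values).
docs = [[0.9, 0.1, 0.0, 0.2], [0.1, 0.8, 0.3, 0.0], [0.2, 0.2, 0.9, 0.1]]
query = [1.0, 0.0, 0.1, 0.1]

enc_docs = [encrypt_document(M, p) for p in docs]
enc_query = encrypt_query(M_inv, query)

# (M^T p) . (M^{-1} q) = p^T M M^{-1} q = p . q, so the ranking is unchanged.
plain_scores = [dot(p, query) for p in docs]
cipher_scores = [dot(c, enc_query) for c in enc_docs]
best = max(range(len(docs)), key=lambda i: cipher_scores[i])
```

Here `cipher_scores` matches `plain_scores` up to floating-point error, so the server-side ranking over encrypted vectors agrees with the plaintext ranking. A real deployment would search the encrypted vectors through an ANN index rather than scoring every document.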
Source journal
Information Sciences (Engineering and Technology; Computer Science: Information Systems)
CiteScore: 14.00
Self-citation rate: 17.30%
Articles published: 1322
Review time: 10.4 months
Journal description: Informatics and Computer Science Intelligent Systems Applications is an esteemed international journal that focuses on publishing original and creative research findings in the field of information sciences. We also feature a limited number of timely tutorial and survey contributions. Our journal aims to cater to a diverse audience, including researchers, developers, managers, strategic planners, graduate students, and anyone interested in staying up-to-date with cutting-edge research in information science, knowledge engineering, and intelligent systems. While readers are expected to share a common interest in information science, they come from varying backgrounds such as engineering, mathematics, statistics, physics, computer science, cell biology, molecular biology, management science, cognitive science, neurobiology, behavioral sciences, and biochemistry.