Information Leakage in Encrypted Deduplication via Frequency Analysis

Jingwei Li, Chuan Qin, P. Lee, Xiaosong Zhang
Published in: 2017 47th Annual IEEE/IFIP International Conference on Dependable Systems and Networks (DSN)
Publication date: 2017-06-26
DOI: 10.1109/DSN.2017.28 (https://doi.org/10.1109/DSN.2017.28)
Citation count: 5

Abstract

Encrypted deduplication seamlessly combines encryption and deduplication to simultaneously achieve both data security and storage efficiency. State-of-the-art encrypted deduplication systems mostly adopt a deterministic encryption approach that encrypts each plaintext chunk with a key derived from the content of the chunk itself, so that identical plaintext chunks are always encrypted into identical ciphertext chunks for deduplication. However, such deterministic encryption inherently reveals the underlying frequency distribution of the original plaintext chunks. This allows an adversary to launch frequency analysis against the resulting ciphertext chunks, and ultimately infer the content of the original plaintext chunks. In this paper, we study how frequency analysis practically affects information leakage in encrypted deduplication storage, from both attack and defense perspectives. We first propose a new inference attack that exploits chunk locality to increase the coverage of inferred chunks. We conduct trace-driven evaluation on both real-world and synthetic datasets, and show that the new inference attack can infer a significant fraction of plaintext chunks under backup workloads. To protect against frequency analysis, we borrow the idea of existing performance-driven deduplication approaches and consider an encryption scheme called MinHash encryption, which disturbs the frequency rank of ciphertext chunks by encrypting some identical plaintext chunks into multiple distinct ciphertext chunks. Our trace-driven evaluation shows that MinHash encryption effectively mitigates the inference attack, while maintaining high storage efficiency.
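The leakage described in the abstract can be illustrated with a small sketch. It is not the authors' implementation, and it relies on several assumptions: the ciphertext is modeled by a keyed digest rather than a real cipher, the attack shown is plain rank-based frequency analysis (not the locality-based attack the paper proposes), and MinHash encryption is assumed to key every chunk in a segment with the minimum chunk hash of that segment, a detail the abstract does not spell out. All function names are hypothetical.

```python
import hashlib
import hmac
from collections import Counter


def chunk_key(chunk: bytes) -> bytes:
    """Content-derived key: identical chunks always yield the same key."""
    return hashlib.sha256(chunk).digest()


def ciphertext_tag(key: bytes, chunk: bytes) -> str:
    """Stand-in for a deterministic ciphertext (assumption: NOT real encryption).

    It only models the property that matters here: the same key and the
    same chunk always produce the same output.
    """
    return hmac.new(key, chunk, hashlib.sha256).hexdigest()


def deterministic_encrypt(chunks):
    """Encrypt each chunk under a key derived from its own content."""
    return [ciphertext_tag(chunk_key(c), c) for c in chunks]


def minhash_encrypt(chunks, segment_size):
    """Assumed MinHash-style scheme: one key per segment, taken as the
    minimum chunk hash in that segment."""
    out = []
    for i in range(0, len(chunks), segment_size):
        segment = chunks[i:i + segment_size]
        seg_key = min(hashlib.sha256(c).digest() for c in segment)
        out.extend(ciphertext_tag(seg_key, c) for c in segment)
    return out


def frequency_attack(cipher_tags, auxiliary_chunks):
    """Classical frequency analysis: pair the i-th most frequent ciphertext
    with the i-th most frequent chunk of auxiliary plaintext data."""
    c_rank = [t for t, _ in Counter(cipher_tags).most_common()]
    p_rank = [p for p, _ in Counter(auxiliary_chunks).most_common()]
    return dict(zip(c_rank, p_rank))


if __name__ == "__main__":
    backup = [b"A"] * 5 + [b"B"] * 3 + [b"C"]      # target backup (toy data)
    auxiliary = [b"A"] * 4 + [b"B"] * 2 + [b"C"]   # older backup known to the adversary
    tags = deterministic_encrypt(backup)
    guess = frequency_attack(tags, auxiliary)
    hits = sum(guess[t] == c for t, c in zip(tags, backup))
    print(f"deterministic: {hits}/{len(backup)} chunks inferred")

    data = [b"A", b"B", b"A", b"C", b"B", b"C"]
    print("distinct ciphertexts, deterministic:", len(set(deterministic_encrypt(data))))
    print("distinct ciphertexts, MinHash-style:", len(set(minhash_encrypt(data, 2))))
```

In this toy model, deterministic encryption preserves the frequency rank of every chunk, so rank matching recovers the plaintext whenever the auxiliary data has the same ranking. Under the assumed segment-level keying, a chunk that appears in segments with different minimum hashes maps to several ciphertexts, which flattens the frequency distribution at the cost of some lost deduplication; identical segments still encrypt identically and therefore still deduplicate.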