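Exploit the scale of big data for data privacy: An efficient scheme based on distance-preserving artificial noise and secret matrix transform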

2014 IEEE China Summit & International Conference on Signal and Information Processing (ChinaSIP) Pub Date : 2014-07-09 DOI:10.1109/ChinaSIP.2014.6889293

Xiaohua Li, Zifan Zhang


Citations: 2

Abstract


Exploit the scale of big data for data privacy: An efficient scheme based on distance-preserving artificial noise and secret matrix transform

In this paper we show that the extensive results on blind and non-blind channel identification developed by the signal-processing-for-communications community can play an important role in guaranteeing big data privacy. It is widely believed that the sheer scale of big data makes most conventional data privacy techniques ineffective. In contrast to this pessimistic belief, we propose a scheme that exploits the sheer scale to guarantee privacy. The scheme jointly uses artificial noise and a secret matrix transform to scramble the source data. Desirable data utility can be supported because the noise and the transform preserve some important geometric properties of the source data. Through a comprehensive privacy analysis, we use blind/non-blind channel identification theory to show that neither the secret transform matrix nor the source data can be estimated from the scrambled data. The artificial noise and the sheer scale of big data are critical for this purpose. Simulations of collaborative filtering are conducted to demonstrate the proposed scheme.
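The abstract does not spell out how the noise and the transform are constructed, so the following is only an illustrative sketch of the general idea of distance-preserving scrambling, not the authors' scheme. It assumes (hypothetically) that the secret transform is a random orthogonal matrix `Q` and that the artificial noise is a single random offset vector added to every record; both choices leave pairwise Euclidean distances unchanged. The function names `scramble` and `pairwise_distances` are made up for this example.

```python
import numpy as np


def scramble(X, rng):
    """Scramble the columns (records) of X; illustrative assumptions only."""
    d = X.shape[0]
    # Assumption: the secret transform is a random orthogonal matrix,
    # obtained here from the QR decomposition of a Gaussian matrix.
    Q, _ = np.linalg.qr(rng.standard_normal((d, d)))
    # Assumption: the artificial noise is one offset vector shared by all
    # records, i.e. a translation, which keeps pairwise distances intact.
    noise = rng.standard_normal((d, 1))
    return Q @ X + noise


def pairwise_distances(X):
    """Euclidean distances between all pairs of columns of X."""
    diff = X[:, :, None] - X[:, None, :]
    return np.sqrt((diff ** 2).sum(axis=0))


rng = np.random.default_rng(0)
X = rng.standard_normal((5, 20))   # 20 synthetic records of dimension 5
Y = scramble(X, rng)

# The scrambled records differ from the source, yet their geometry matches.
distances_preserved = np.allclose(pairwise_distances(X), pairwise_distances(Y))
```

Because distances survive the scrambling, distance-based analyses such as nearest-neighbour collaborative filtering could in principle run on `Y` instead of `X`. Whether this toy construction resists the estimation attacks analysed in the paper is not something the sketch establishes.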
