Result attack: a privacy breaching attack for personal data through K-means algorithm
Sharath Yaji and Neelima Bayyapu
Cyber-Physical Systems, vol. 1, pp. 11-40, 31 August 2020. DOI: 10.1080/23335777.2020.1811380
ABSTRACT Protecting data privacy is among the most significant challenges of the present era. This paper demonstrates how machine learning can be used by an attacker to compromise data privacy. As a demonstration, we choose an attack on handwritten signatures: the attacker uses the available signatures for training and appends malicious signatures during the testing process until the desired result is obtained, then manipulates that result to carry out the attack. We propose the result attack to highlight the need to keep genuine signatures secret. The attack is illustrated by applying the K-means algorithm to the MNIST dataset. Differential Privacy (DP) is adopted for the defense discussion, and is illustrated by adding red or white noise to the MNIST dataset. Observation shows that adding noise to personal data successfully defends against the result attack. The area under the receiver operating characteristic curve is 0.878719 for the original dataset, 0.4999901 for the original dataset vs. added red noise, and 0.4448475 for the original dataset vs. added white noise. For the defense model, we conclude that adding white noise is better than adding red noise, i.e. white noise aggregation is 11% better than red noise.
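The defense described in the abstract, perturbing the data with red or white noise before clustering, can be sketched as follows. This is an illustrative reconstruction, not the authors' code: MNIST is replaced by a small synthetic stand-in, the red-noise model (an AR(1) process along the feature axis) and the noise scale are assumptions, and scikit-learn's `KMeans` stands in for the paper's K-means step.

```python
# Sketch (not the authors' implementation): add white vs. red noise to
# image-like vectors, then cluster with K-means as in the paper's setup.
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)

# Synthetic stand-in for MNIST: 200 samples, 64 features, two loose groups.
data = np.vstack([
    rng.normal(0.2, 0.1, size=(100, 64)),
    rng.normal(0.8, 0.1, size=(100, 64)),
])

def white_noise(shape, scale=0.5):
    """Uncorrelated Gaussian noise (flat power spectrum)."""
    return rng.normal(0.0, scale, size=shape)

def red_noise(shape, scale=0.5, alpha=0.9):
    """AR(1) noise along the feature axis (power concentrated at
    low frequencies) -- an assumed model of 'red' noise."""
    w = rng.normal(0.0, scale, size=shape)
    r = np.zeros(shape)
    r[:, 0] = w[:, 0]
    for j in range(1, shape[1]):
        r[:, j] = alpha * r[:, j - 1] + np.sqrt(1 - alpha**2) * w[:, j]
    return r

results = {}
for name, noise in [("white", white_noise(data.shape)),
                    ("red", red_noise(data.shape))]:
    noisy = data + noise
    labels = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(noisy)
    results[name] = labels
    print(name, np.bincount(labels))
```

In the paper's evaluation, the two noise types are then compared by how far the attacker's ROC-curve area on the perturbed data drops relative to the original dataset.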