Defending Against Model Inversion Attack by Adversarial Examples

Jing Wen, S. Yiu, L. Hui
{"title":"Defending Against Model Inversion Attack by Adversarial Examples","authors":"Jing Wen, S. Yiu, L. Hui","doi":"10.1109/CSR51186.2021.9527945","DOIUrl":null,"url":null,"abstract":"Model inversion (MI) attacks aim to infer and reconstruct the input data from the output of a neural network, which poses a severe threat to the privacy of input data. Inspired by adversarial examples, we propose defending against MI attacks by adding adversarial noise to the output. The critical challenge is finding a noise vector that maximizes the inversion error and introduces negligible utility loss to the target model. We propose an algorithm to craft such noise vectors, which also incorporates utility-loss constraints. Specifically, our algorithm takes advantage of the gradient of an inversion model we train to mimic the adversary and compute a noise vector to turn the output into an adversarial example that can maximize the reconstruction error of the inversion model. Then we apply a label modifier that keeps the label unchanged to achieve zero accuracy loss of the target model. Our defense does not tamper with the training process or need the private training dataset. Thus it can be easily applied to any current neural networks or APIs. We evaluate our method under both standard and adaptive attack settings. Our empirical results show our approach is effective against state-of-the-art MI attacks due to the transferability of adversarial examples and outperforms existing defenses. Furthermore, it causes more reconstruction errors while introducing zero accuracy loss and less distortion than existing defenses.","PeriodicalId":253300,"journal":{"name":"2021 IEEE International Conference on Cyber Security and Resilience (CSR)","volume":"10 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2021-07-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"5","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"2021 IEEE International Conference on Cyber Security and Resilience (CSR)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/CSR51186.2021.9527945","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 5

Abstract

Model inversion (MI) attacks aim to infer and reconstruct the input data from the output of a neural network, which poses a severe threat to the privacy of the input data. Inspired by adversarial examples, we propose defending against MI attacks by adding adversarial noise to the output. The critical challenge is finding a noise vector that maximizes the inversion error while introducing negligible utility loss to the target model. We propose an algorithm to craft such noise vectors, which also incorporates utility-loss constraints. Specifically, our algorithm trains an inversion model to mimic the adversary and uses its gradient to compute a noise vector that turns the output into an adversarial example, maximizing the reconstruction error of the inversion model. We then apply a label modifier that keeps the predicted label unchanged, achieving zero accuracy loss for the target model. Our defense neither tampers with the training process nor needs the private training dataset, so it can be easily applied to any existing neural network or API. We evaluate our method under both standard and adaptive attack settings. Our empirical results show that, owing to the transferability of adversarial examples, our approach is effective against state-of-the-art MI attacks and outperforms existing defenses. Furthermore, it causes larger reconstruction error while introducing zero accuracy loss and less distortion than existing defenses.
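The abstract describes the crafting procedure only at a high level. Below is a minimal, hypothetical PyTorch sketch of what a gradient-based noise-crafting step with a label-preserving modifier could look like; the names (craft_defensive_noise, inversion_model, x_private, epsilon, steps), the sign-gradient ascent update, and the logit-swapping label modifier are illustrative assumptions, not the authors' implementation.

```python
# Hypothetical sketch of output-side adversarial noise against an inversion model.
# Assumes PyTorch; all names and the update rule are illustrative, not the paper's code.
import torch
import torch.nn.functional as F

def craft_defensive_noise(output, x_private, inversion_model, epsilon=0.1, steps=10):
    """Perturb the target model's output vector so that the inversion model's
    reconstruction error grows, while the predicted label (argmax) is preserved.

    output:          (B, C) confidence/logit vectors from the target model (detached)
    x_private:       the private inputs the adversary would try to reconstruct
    inversion_model: a model trained by the defender to mimic the adversary's inverter
    epsilon:         total L-infinity perturbation budget (stand-in for the utility constraint)
    """
    original_label = output.argmax(dim=1)
    noisy = output.clone().detach()

    for _ in range(steps):
        noisy.requires_grad_(True)
        # Reconstruction the mimicked adversary would produce from the released output.
        reconstruction = inversion_model(noisy)
        # Ascend the reconstruction error so the output becomes adversarial
        # for the inversion model.
        loss = F.mse_loss(reconstruction, x_private)
        grad, = torch.autograd.grad(loss, noisy)
        with torch.no_grad():
            noisy = noisy + (epsilon / steps) * grad.sign()

    # Label modifier (one simple possibility): if the perturbation flipped the
    # top-1 class, swap the two logits so the original label is restored,
    # giving zero accuracy loss on the target model.
    with torch.no_grad():
        new_label = noisy.argmax(dim=1)
        for i in range(noisy.size(0)):
            if new_label[i] != original_label[i]:
                top = noisy[i, new_label[i]].clone()
                noisy[i, new_label[i]] = noisy[i, original_label[i]]
                noisy[i, original_label[i]] = top
    return noisy.detach()
```

In this sketch the epsilon budget plays the role of the utility-loss constraint, bounding how far the released output may drift from the original confidence vector, while the final swap guarantees that the top-1 label, and hence the target model's accuracy, is unchanged.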