大海捞针:构建逃避验证的神经网络

Árpád Berta, Gábor Danner, István Hegedüs, Márk Jelasity
{"title":"大海捞针:构建逃避验证的神经网络","authors":"Árpád Berta, Gábor Danner, István Hegedüs, Márk Jelasity","doi":"10.1145/3531536.3532966","DOIUrl":null,"url":null,"abstract":"Machine learning models are vulnerable to adversarial attacks, where a small, invisible, malicious perturbation of the input changes the predicted label. A large area of research is concerned with verification techniques that attempt to decide whether a given model has adversarial inputs close to a given benign input. Here, we show that current approaches to verification have a key vulnerability: we construct a model that is not robust but passes current verifiers. The idea is to insert artificial adversarial perturbations by adding a backdoor to a robust neural network model. In our construction, the adversarial input subspace that triggers the backdoor has a very small volume, and outside this subspace the gradient of the model is identical to that of the clean model. In other words, we seek to create a \"needle in a haystack\" search problem. For practical purposes, we also require that the adversarial samples be robust to JPEG compression. Large \"needle in the haystack\" problems are practically impossible to solve with any search algorithm. Formal verifiers can handle this in principle, but they do not scale up to real-world networks at the moment, and achieving this is a challenge because the verification problem is NP-complete. Our construction is based on training a hiding and a revealing network using deep steganography. Using the revealing network, we create a separate backdoor network and integrate it into the target network. We train our deep steganography networks over the CIFAR-10 dataset. We then evaluate our construction using state-of-the-art adversarial attacks and backdoor detectors over the CIFAR-10 and the ImageNet datasets. We made the code and models publicly available at https://github.com/szegedai/hiding-needles-in-a-haystack.","PeriodicalId":164949,"journal":{"name":"Proceedings of the 2022 ACM Workshop on Information Hiding and Multimedia Security","volume":"223 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2022-06-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"1","resultStr":"{\"title\":\"Hiding Needles in a Haystack: Towards Constructing Neural Networks that Evade Verification\",\"authors\":\"Árpád Berta, Gábor Danner, István Hegedüs, Márk Jelasity\",\"doi\":\"10.1145/3531536.3532966\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Machine learning models are vulnerable to adversarial attacks, where a small, invisible, malicious perturbation of the input changes the predicted label. A large area of research is concerned with verification techniques that attempt to decide whether a given model has adversarial inputs close to a given benign input. Here, we show that current approaches to verification have a key vulnerability: we construct a model that is not robust but passes current verifiers. The idea is to insert artificial adversarial perturbations by adding a backdoor to a robust neural network model. In our construction, the adversarial input subspace that triggers the backdoor has a very small volume, and outside this subspace the gradient of the model is identical to that of the clean model. In other words, we seek to create a \\\"needle in a haystack\\\" search problem. For practical purposes, we also require that the adversarial samples be robust to JPEG compression. Large \\\"needle in the haystack\\\" problems are practically impossible to solve with any search algorithm. Formal verifiers can handle this in principle, but they do not scale up to real-world networks at the moment, and achieving this is a challenge because the verification problem is NP-complete. Our construction is based on training a hiding and a revealing network using deep steganography. Using the revealing network, we create a separate backdoor network and integrate it into the target network. We train our deep steganography networks over the CIFAR-10 dataset. We then evaluate our construction using state-of-the-art adversarial attacks and backdoor detectors over the CIFAR-10 and the ImageNet datasets. We made the code and models publicly available at https://github.com/szegedai/hiding-needles-in-a-haystack.\",\"PeriodicalId\":164949,\"journal\":{\"name\":\"Proceedings of the 2022 ACM Workshop on Information Hiding and Multimedia Security\",\"volume\":\"223 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2022-06-23\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"1\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Proceedings of the 2022 ACM Workshop on Information Hiding and Multimedia Security\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1145/3531536.3532966\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Proceedings of the 2022 ACM Workshop on Information Hiding and Multimedia Security","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1145/3531536.3532966","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 1

摘要

机器学习模型容易受到对抗性攻击,在这种攻击中,输入的一个小的、看不见的、恶意的扰动会改变预测的标签。一个很大的研究领域是与验证技术有关的,这些技术试图确定给定模型是否具有接近给定良性输入的敌对输入。在这里,我们展示了当前的验证方法有一个关键的弱点:我们构建了一个不健壮但通过当前验证者的模型。这个想法是通过在鲁棒神经网络模型中添加后门来插入人工的对抗性扰动。在我们的构造中,触发后门的对抗性输入子空间具有非常小的体积,并且在该子空间之外,模型的梯度与干净模型的梯度相同。换句话说,我们试图创造一个“大海捞针”的搜索问题。出于实际目的,我们还要求对抗性样本对JPEG压缩具有鲁棒性。大的“大海捞针”问题实际上是不可能用任何搜索算法解决的。正式的验证器原则上可以处理这个问题,但目前它们不能扩展到现实世界的网络,实现这一点是一个挑战,因为验证问题是np完备的。我们的构建是基于使用深度隐写术训练隐藏和显示网络。利用揭露网络,我们创建一个单独的后门网络,并将其整合到目标网络中。我们在CIFAR-10数据集上训练我们的深度隐写网络。然后,我们在CIFAR-10和ImageNet数据集上使用最先进的对抗性攻击和后门检测器来评估我们的构建。我们在https://github.com/szegedai/hiding-needles-in-a-haystack上公开了代码和模型。
本文章由计算机程序翻译,如有差异,请以英文原文为准。
Hiding Needles in a Haystack: Towards Constructing Neural Networks that Evade Verification
Machine learning models are vulnerable to adversarial attacks, where a small, invisible, malicious perturbation of the input changes the predicted label. A large area of research is concerned with verification techniques that attempt to decide whether a given model has adversarial inputs close to a given benign input. Here, we show that current approaches to verification have a key vulnerability: we construct a model that is not robust but passes current verifiers. The idea is to insert artificial adversarial perturbations by adding a backdoor to a robust neural network model. In our construction, the adversarial input subspace that triggers the backdoor has a very small volume, and outside this subspace the gradient of the model is identical to that of the clean model. In other words, we seek to create a "needle in a haystack" search problem. For practical purposes, we also require that the adversarial samples be robust to JPEG compression. Large "needle in the haystack" problems are practically impossible to solve with any search algorithm. Formal verifiers can handle this in principle, but they do not scale up to real-world networks at the moment, and achieving this is a challenge because the verification problem is NP-complete. Our construction is based on training a hiding and a revealing network using deep steganography. Using the revealing network, we create a separate backdoor network and integrate it into the target network. We train our deep steganography networks over the CIFAR-10 dataset. We then evaluate our construction using state-of-the-art adversarial attacks and backdoor detectors over the CIFAR-10 and the ImageNet datasets. We made the code and models publicly available at https://github.com/szegedai/hiding-needles-in-a-haystack.
求助全文
通过发布文献求助,成功后即可免费获取论文全文。 去求助
来源期刊
自引率
0.00%
发文量
0
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
确定
请完成安全验证×
copy
已复制链接
快去分享给好友吧!
我知道了
右上角分享
点击右上角分享
0
联系我们:info@booksci.cn Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。 Copyright © 2023 布克学术 All rights reserved.
京ICP备2023020795号-1
ghs 京公网安备 11010802042870号
Book学术文献互助
Book学术文献互助群
群 号:481959085
Book学术官方微信