Adversarial Robust Classification by Conditional Generative Model Inversion

Mitra Alirezaei, T. Tasdizen
{"title":"基于条件生成模型反演的对抗鲁棒分类","authors":"Mitra Alirezaei, T. Tasdizen","doi":"10.1109/ICMLC56445.2022.9941288","DOIUrl":null,"url":null,"abstract":"Most adversarial attack defense methods rely on obfuscating gradients. These methods are easily circumvented by attacks which either do not use the gradient or by attacks which approximate and use the corrected gradient Defenses that do not obfuscate gradients such as adversarial training exist, but these approaches generally make assumptions about the attack such as its magnitude. We propose a classification model that does not obfuscate gradients and is robust by construction against black-box attacks without assuming prior knowledge about the attack. Our method casts classification as an optimization problem where we \"invert\" a conditional generator trained on unperturbed, natural images to find the class that generates the closest sample to the query image. We hypothesize that a potential source of brittleness against adversarial attacks is the high-to-low-dimensional nature of feed-forward classifiers. On the other hand, a generative model is typically a low-to-high-dimensional mapping. Since the range of images that can be generated by the model for a given class is limited to its learned manifold, the \"inversion\" process cannot generate images that are arbitrarily close to adversarial examples leading to a robust model by construction. While the method is related to Defense-GAN, the use of a conditional generative model and inversion in our model instead of the feed-forward classifier is a critical difference. Unlike Defense-GAN, we show that our method does not obfuscate gradients. We demonstrate that our model is extremely robust against black-box attacks and does not depend on previous knowledge about the attack strength.","PeriodicalId":117829,"journal":{"name":"2022 International Conference on Machine Learning and Cybernetics (ICMLC)","volume":"19 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2022-01-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"Adversarial Robust Classification by Conditional Generative Model Inversion\",\"authors\":\"Mitra Alirezaei, T. Tasdizen\",\"doi\":\"10.1109/ICMLC56445.2022.9941288\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Most adversarial attack defense methods rely on obfuscating gradients. These methods are easily circumvented by attacks which either do not use the gradient or by attacks which approximate and use the corrected gradient Defenses that do not obfuscate gradients such as adversarial training exist, but these approaches generally make assumptions about the attack such as its magnitude. We propose a classification model that does not obfuscate gradients and is robust by construction against black-box attacks without assuming prior knowledge about the attack. Our method casts classification as an optimization problem where we \\\"invert\\\" a conditional generator trained on unperturbed, natural images to find the class that generates the closest sample to the query image. We hypothesize that a potential source of brittleness against adversarial attacks is the high-to-low-dimensional nature of feed-forward classifiers. On the other hand, a generative model is typically a low-to-high-dimensional mapping. 
Since the range of images that can be generated by the model for a given class is limited to its learned manifold, the \\\"inversion\\\" process cannot generate images that are arbitrarily close to adversarial examples leading to a robust model by construction. While the method is related to Defense-GAN, the use of a conditional generative model and inversion in our model instead of the feed-forward classifier is a critical difference. Unlike Defense-GAN, we show that our method does not obfuscate gradients. We demonstrate that our model is extremely robust against black-box attacks and does not depend on previous knowledge about the attack strength.\",\"PeriodicalId\":117829,\"journal\":{\"name\":\"2022 International Conference on Machine Learning and Cybernetics (ICMLC)\",\"volume\":\"19 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2022-01-12\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"2022 International Conference on Machine Learning and Cybernetics (ICMLC)\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1109/ICMLC56445.2022.9941288\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"2022 International Conference on Machine Learning and Cybernetics (ICMLC)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/ICMLC56445.2022.9941288","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 0

Abstract

Most adversarial attack defense methods rely on obfuscating gradients. These methods are easily circumvented by attacks that either do not use the gradient or that approximate and use the corrected gradient. Defenses that do not obfuscate gradients, such as adversarial training, exist, but these approaches generally make assumptions about the attack, such as its magnitude. We propose a classification model that does not obfuscate gradients and is robust by construction against black-box attacks without assuming prior knowledge about the attack. Our method casts classification as an optimization problem in which we "invert" a conditional generator trained on unperturbed, natural images to find the class that generates the sample closest to the query image. We hypothesize that a potential source of brittleness against adversarial attacks is the high-to-low-dimensional nature of feed-forward classifiers. A generative model, on the other hand, is typically a low-to-high-dimensional mapping. Since the range of images the model can generate for a given class is limited to its learned manifold, the "inversion" process cannot produce images arbitrarily close to adversarial examples, leading to a model that is robust by construction. While the method is related to Defense-GAN, the use of a conditional generative model and inversion in place of a feed-forward classifier is a critical difference. Unlike Defense-GAN, we show that our method does not obfuscate gradients. We demonstrate that our model is extremely robust against black-box attacks and does not depend on prior knowledge of the attack strength.
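The inversion described in the abstract can be read as a small optimization problem: for a query image x, find, for each candidate class y, the latent code z that minimizes ||G(z, y) - x||^2, and predict the class with the smallest reconstruction error. The sketch below is a minimal illustration of that idea, assuming a pre-trained PyTorch conditional generator G(z, y); the function name, latent dimension, and optimization settings are illustrative assumptions, not details taken from the paper.

```python
# Minimal sketch of classification by conditional-generator inversion.
# Assumes `G(z, y)` is a pre-trained conditional generator (e.g., a conditional
# GAN trained on unperturbed natural images). Hyperparameters are illustrative.
import torch

def classify_by_inversion(G, x, num_classes, latent_dim=100, steps=200, lr=0.05):
    """Return the class whose learned manifold contains the sample closest to x."""
    best_class, best_loss = None, float("inf")
    for y in range(num_classes):
        # Optimize a latent code z so that G(z, y) reconstructs the query image x.
        z = torch.randn(1, latent_dim, requires_grad=True)
        label = torch.tensor([y])
        opt = torch.optim.Adam([z], lr=lr)
        for _ in range(steps):
            opt.zero_grad()
            # Reconstruction error restricted to the class-y manifold.
            loss = torch.mean((G(z, label) - x) ** 2)
            loss.backward()
            opt.step()
        if loss.item() < best_loss:
            best_class, best_loss = y, loss.item()
    return best_class
```

Because the inner minimization over z is non-convex, a practical implementation would likely use several random restarts per class and keep the best reconstruction; the single-start loop above is kept short for clarity.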