Attack to Explain Deep Representation

M. Jalwana, Naveed Akhtar, Bennamoun, A. Mian
{"title":"Attack to Explain Deep Representation","authors":"M. Jalwana, Naveed Akhtar, Bennamoun, A. Mian","doi":"10.1109/cvpr42600.2020.00956","DOIUrl":null,"url":null,"abstract":"Deep visual models are susceptible to extremely low magnitude perturbations to input images. Though carefully crafted, the perturbation patterns generally appear noisy, yet they are able to perform controlled manipulation of model predictions. This observation is used to argue that deep representation is misaligned with human perception. This paper counter-argues and proposes the first attack on deep learning that aims at explaining the learned representation instead of fooling it. By extending the input domain of the manipulative signal and employing a model faithful channelling, we iteratively accumulate adversarial perturbations for a deep model. The accumulated signal gradually manifests itself as a collection of visually salient features of the target label (in model fooling), casting adversarial perturbations as primitive features of the target label. Our attack provides the first demonstration of systematically computing perturbations for adversarially non-robust classifiers that comprise salient visual features of objects. We leverage the model explaining character of our algorithm to perform image generation, inpainting and interactive image manipulation by attacking adversarially robust classifiers. The visually appealing results across these applications demonstrate the utility of our attack (and perturbations in general) beyond model fooling.","PeriodicalId":6715,"journal":{"name":"2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)","volume":"22 1","pages":"9540-9549"},"PeriodicalIF":0.0000,"publicationDate":"2020-06-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"12","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/cvpr42600.2020.00956","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 12

Abstract

Deep visual models are susceptible to extremely low-magnitude perturbations to input images. Though carefully crafted, these perturbation patterns generally appear noisy, yet they can controllably manipulate model predictions. This observation is used to argue that deep representation is misaligned with human perception. This paper counter-argues and proposes the first attack on deep learning that aims at explaining the learned representation instead of fooling it. By extending the input domain of the manipulative signal and employing a model-faithful channelling, we iteratively accumulate adversarial perturbations for a deep model. The accumulated signal gradually manifests itself as a collection of visually salient features of the target label (used in model fooling), casting adversarial perturbations as primitive features of the target label. Our attack provides the first demonstration of systematically computing, for adversarially non-robust classifiers, perturbations that comprise salient visual features of objects. We leverage the model-explaining character of our algorithm to perform image generation, inpainting, and interactive image manipulation by attacking adversarially robust classifiers. The visually appealing results across these applications demonstrate the utility of our attack (and perturbations in general) beyond model fooling.
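
The core idea, iteratively accumulating a targeted perturbation until it resembles salient features of the target class, can be illustrated with a short PyTorch-style sketch. This is an assumption-laden illustration rather than the paper's algorithm: the function name accumulate_target_perturbation, the parameters num_iters and step_size, and the simple sign-gradient update are hypothetical stand-ins for the model-faithful channelling and extended-input-domain accumulation described in the abstract.

```python
import torch
import torch.nn.functional as F

def accumulate_target_perturbation(model, seed_images, target_class,
                                    num_iters=300, step_size=0.01):
    """Iteratively accumulate a single targeted perturbation over a batch of
    seed images so that the accumulated signal drives the model toward
    `target_class`. Illustrative sketch only; not the paper's exact method."""
    model.eval()
    # One perturbation shared across all seed images (broadcast over the batch).
    delta = torch.zeros_like(seed_images[:1], requires_grad=True)

    for _ in range(num_iters):
        perturbed = (seed_images + delta).clamp(0.0, 1.0)
        logits = model(perturbed)
        target = torch.full((perturbed.size(0),), target_class,
                            dtype=torch.long, device=perturbed.device)
        # Targeted objective: push every perturbed image toward the target label.
        loss = F.cross_entropy(logits, target)
        loss.backward()

        with torch.no_grad():
            # Sign-gradient step; no norm projection, loosely mirroring the
            # relaxed perturbation range ("extended input domain") idea.
            delta -= step_size * delta.grad.sign()
        delta.grad.zero_()

    return delta.detach()
```

In practice one would run such a loop against a pretrained classifier and visualise the returned perturbation; the paper's claim is that, with an appropriate accumulation scheme, this signal increasingly exhibits visual features of the target label rather than noise.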