Generating Targeted Adversarial Attacks and Assessing their Effectiveness in Fooling Deep Neural Networks

Shivangi Gajjar, Avik Hati, Shruti Bhilare, Srimanta Mandal
Published in: 2022 IEEE International Conference on Signal Processing and Communications (SPCOM), 2022-07-11
DOI: 10.1109/SPCOM55316.2022.9840784 (https://doi.org/10.1109/SPCOM55316.2022.9840784)
Citations: 2

Abstract

Deep neural network (DNN) models have gained popularity for most image classification problems. However, DNNs also have numerous vulnerabilities. An adversary can exploit these vulnerabilities to execute a successful adversarial attack, which is an algorithm that generates perturbed inputs capable of fooling a well-trained DNN. Among the various existing adversarial attacks, DeepFool, a white-box untargeted attack, is considered one of the most reliable algorithms for computing adversarial perturbations. However, in some scenarios, such as person recognition, an adversary might want to carry out a targeted attack so that the input is misclassified into a specific target class. Moreover, studies show that defending against a targeted attack is harder than defending against an untargeted one. Hence, generating a targeted adversarial example is desirable from an attacker's perspective. In this paper, we propose 'Targeted DeepFool', which is based on computing the minimal amount of perturbation required to reach the target hyperplane. The proposed algorithm produces a minimal amount of distortion on two conventional image datasets, MNIST and CIFAR10. Further, Targeted DeepFool shows excellent performance in terms of adversarial success rate.
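
The abstract does not spell out the algorithm, so the following is only a minimal sketch of the idea it names: computing the smallest perturbation that reaches the hyperplane of a chosen target class. It assumes an affine classifier argmax(W x + b), for which the smallest L2 step onto the boundary between the current class k and the target class t is r = |f_t(x) - f_k(x)| / ||w_t - w_k||^2 * (w_t - w_k), the projection used by the original DeepFool. The function name `targeted_deepfool_linear`, the `overshoot` factor, and the iteration cap are illustrative assumptions, not the authors' implementation. For a real DNN, the rows of W would presumably be replaced by the gradients of the logits at the current point.

```python
import numpy as np

def targeted_deepfool_linear(x, W, b, target, overshoot=0.02, max_iter=50):
    """Hypothetical sketch: push x toward class `target` of argmax(W @ x + b)."""
    x_adv = np.asarray(x, dtype=float).copy()
    for _ in range(max_iter):
        scores = W @ x_adv + b
        current = int(np.argmax(scores))
        if current == target:
            break  # the classifier already outputs the target class
        # Normal of the hyperplane separating the current class from the target.
        w_diff = W[target] - W[current]
        # Score gap; it is non-positive while the target is not the top class.
        f_diff = scores[target] - scores[current]
        # Smallest L2 step that lands exactly on that hyperplane.
        r = (abs(f_diff) / (np.dot(w_diff, w_diff) + 1e-12)) * w_diff
        # Overshoot slightly so the point crosses the boundary (assumed value).
        x_adv = x_adv + (1.0 + overshoot) * r
    # Return the adversarial point and the total perturbation applied.
    return x_adv, x_adv - np.asarray(x, dtype=float)
```

Calling `targeted_deepfool_linear(x, W, b, target=t)` returns the perturbed input together with the accumulated perturbation, whose L2 norm can serve as a distortion measure. The loop repeats because crossing the boundary with the current top class does not guarantee that the target becomes the top class when a third class scores higher.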