Priority Adversarial Example in Evasion Attack on Multiple Deep Neural Networks

Hyun Kwon, H. Yoon, D. Choi
{"title":"多深度神经网络逃避攻击的优先级对抗实例","authors":"Hyun Kwon, H. Yoon, D. Choi","doi":"10.1109/ICAIIC.2019.8669034","DOIUrl":null,"url":null,"abstract":"Deep neural networks (DNNs) provide superior per-formance on machine learning tasks such as image recognition, speech recognition, pattern recognition, and intrusion detection. However, an adversarial example created by adding a little noise to the original data can lead to misclassification by the DNN, and the human eye cannot detect the difference from the original data. For example, if an attacker generates a modified left-turn road sign to be incorrectly categorized by a DNN, an autonomous vehicle with the DNN will incorrect classify the modified left-turn road sign as a right-turn sign, whereas a human will correctly classify the modified sign as a left-turn sign. Such an adversarial example is a serious threat to a DNN. Recently, a multi-target adversarial example was introduced that causes misclassification by several models within each target class using a single modified image. However, it has the vulnerability that as the number of target models increases, the overall attack success rate is reduced. Therefore, if there are several models that the attacker wishes to target, the attacker needs to control the attack success rate for each model by considering the attack priority for each model. In this paper, we propose a priority adversarial example that considers the attack priority for each model in cases targeting several models. The proposed method controls the attack success rate for each model by adjusting the weight of the attack function in the generation process, while maintaining minimum distortion. We used Tensorflow, a widely used machine learning library, and MNIST as the dataset. Experimental results show that the proposed method can control the attack success rate for each model by considering the attack priority of each model while maintaining minimum distortion (on average 3.95 and 2.45 in targeted and untargeted attacks, respectively).","PeriodicalId":273383,"journal":{"name":"2019 International Conference on Artificial Intelligence in Information and Communication (ICAIIC)","volume":"329 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2019-02-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"1","resultStr":"{\"title\":\"Priority Adversarial Example in Evasion Attack on Multiple Deep Neural Networks\",\"authors\":\"Hyun Kwon, H. Yoon, D. Choi\",\"doi\":\"10.1109/ICAIIC.2019.8669034\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Deep neural networks (DNNs) provide superior per-formance on machine learning tasks such as image recognition, speech recognition, pattern recognition, and intrusion detection. However, an adversarial example created by adding a little noise to the original data can lead to misclassification by the DNN, and the human eye cannot detect the difference from the original data. For example, if an attacker generates a modified left-turn road sign to be incorrectly categorized by a DNN, an autonomous vehicle with the DNN will incorrect classify the modified left-turn road sign as a right-turn sign, whereas a human will correctly classify the modified sign as a left-turn sign. Such an adversarial example is a serious threat to a DNN. Recently, a multi-target adversarial example was introduced that causes misclassification by several models within each target class using a single modified image. 
However, it has the vulnerability that as the number of target models increases, the overall attack success rate is reduced. Therefore, if there are several models that the attacker wishes to target, the attacker needs to control the attack success rate for each model by considering the attack priority for each model. In this paper, we propose a priority adversarial example that considers the attack priority for each model in cases targeting several models. The proposed method controls the attack success rate for each model by adjusting the weight of the attack function in the generation process, while maintaining minimum distortion. We used Tensorflow, a widely used machine learning library, and MNIST as the dataset. Experimental results show that the proposed method can control the attack success rate for each model by considering the attack priority of each model while maintaining minimum distortion (on average 3.95 and 2.45 in targeted and untargeted attacks, respectively).\",\"PeriodicalId\":273383,\"journal\":{\"name\":\"2019 International Conference on Artificial Intelligence in Information and Communication (ICAIIC)\",\"volume\":\"329 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2019-02-13\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"1\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"2019 International Conference on Artificial Intelligence in Information and Communication (ICAIIC)\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1109/ICAIIC.2019.8669034\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"2019 International Conference on Artificial Intelligence in Information and Communication (ICAIIC)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/ICAIIC.2019.8669034","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Citations: 1

Abstract

Deep neural networks (DNNs) provide superior performance on machine learning tasks such as image recognition, speech recognition, pattern recognition, and intrusion detection. However, an adversarial example created by adding a little noise to the original data can lead to misclassification by the DNN, and the human eye cannot detect the difference from the original data. For example, if an attacker generates a modified left-turn road sign crafted to be miscategorized by a DNN, an autonomous vehicle using that DNN will incorrectly classify the modified left-turn sign as a right-turn sign, whereas a human will correctly classify it as a left-turn sign. Such an adversarial example is a serious threat to a DNN. Recently, a multi-target adversarial example was introduced that causes several models to each misclassify a single modified image into its own target class. However, it has the weakness that as the number of target models increases, the overall attack success rate is reduced. Therefore, if there are several models that the attacker wishes to target, the attacker needs to control the attack success rate for each model by considering the attack priority of each model. In this paper, we propose a priority adversarial example that considers the attack priority for each model when targeting several models. The proposed method controls the attack success rate for each model by adjusting the weight of the attack function in the generation process, while maintaining minimum distortion. We used TensorFlow, a widely used machine learning library, and MNIST as the dataset. Experimental results show that the proposed method can control the attack success rate for each model by considering the attack priority of each model while keeping distortion low (on average 3.95 and 2.45 in targeted and untargeted attacks, respectively).
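The abstract describes the generation process only at a high level (per-model attack losses weighted by priority, plus a distortion constraint), so the following is a minimal sketch of that general idea rather than the paper's reference implementation. It assumes TensorFlow 2.x, MNIST images in [0, 1] with shape [N, 28, 28, 1], and integer target labels; the helper name make_priority_example and all parameter names are hypothetical, and the loss is a generic weighted targeted cross-entropy plus an L2 distortion term, not necessarily the exact attack function used in the paper.

import tensorflow as tf

# Hypothetical helper: craft one perturbed batch that several MNIST classifiers
# misclassify as the target labels, with per-model weights acting as attack priorities.
def make_priority_example(x, target_labels, models, weights,
                          distortion_coeff=1.0, steps=500, lr=0.01,
                          num_classes=10):
    delta = tf.Variable(tf.zeros_like(x))                  # perturbation to optimize
    optimizer = tf.keras.optimizers.Adam(learning_rate=lr)
    target_onehot = tf.one_hot(target_labels, depth=num_classes)

    for _ in range(steps):
        with tf.GradientTape() as tape:
            x_adv = tf.clip_by_value(x + delta, 0.0, 1.0)
            # Distortion term: keep the adversarial image close to the original.
            loss = distortion_coeff * tf.reduce_sum(tf.square(x_adv - x))
            # Attack terms: each model's targeted cross-entropy, scaled by its priority weight.
            for model, w in zip(models, weights):
                logits = model(x_adv)
                loss += w * tf.reduce_sum(
                    tf.nn.softmax_cross_entropy_with_logits(
                        labels=target_onehot, logits=logits))
        grads = tape.gradient(loss, [delta])
        optimizer.apply_gradients(zip(grads, [delta]))

    return tf.clip_by_value(x + delta, 0.0, 1.0)

In this sketch, raising one model's weight pushes more of the perturbation budget toward fooling that model at the expense of the others, which corresponds to the abstract's description of controlling each model's attack success rate by adjusting the weight of its attack function; an untargeted variant would instead penalize the models' confidence in the true label rather than rewarding confidence in a chosen target class.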