Backdoor Embedding in Convolutional Neural Network Models via Invisible Perturbation

C. Liao, Haoti Zhong, A. Squicciarini, Sencun Zhu, David J. Miller
{"title":"基于不可见摄动的卷积神经网络模型后门嵌入","authors":"C. Liao, Haoti Zhong, A. Squicciarini, Sencun Zhu, David J. Miller","doi":"10.1145/3374664.3375751","DOIUrl":null,"url":null,"abstract":"Deep learning models have consistently outperformed traditional machine learning models in various classification tasks, including image classification. As such, they have become increasingly prevalent in many real world applications including those where security is of great concern. Such popularity, however, may attract attackers to exploit the vulnerabilities of the deployed deep learning models and launch attacks against security-sensitive applications. In this paper, we focus on a specific type of data poisoning attack, which we refer to as a \\em backdoor injection attack. The main goal of the adversary performing such attack is to generate and inject a backdoor into a deep learning model that can be triggered to recognize certain embedded patterns with a target label of the attacker's choice. Additionally, a backdoor injection attack should occur in a stealthy manner, without undermining the efficacy of the victim model. Specifically, we propose two approaches for generating a backdoor that is hardly perceptible yet effective in poisoning the model. We consider two attack settings, with backdoor injection carried out either before model training or during model updating. We carry out extensive experimental evaluations under various assumptions on the adversary model, and demonstrate that such attacks can be effective and achieve a high attack success rate (above 90%) at a small cost of model accuracy loss with a small injection rate, even under the weakest assumption wherein the adversary has no knowledge either of the original training data or the classifier model.","PeriodicalId":171521,"journal":{"name":"Proceedings of the Tenth ACM Conference on Data and Application Security and Privacy","volume":"23 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2018-08-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"219","resultStr":"{\"title\":\"Backdoor Embedding in Convolutional Neural Network Models via Invisible Perturbation\",\"authors\":\"C. Liao, Haoti Zhong, A. Squicciarini, Sencun Zhu, David J. Miller\",\"doi\":\"10.1145/3374664.3375751\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Deep learning models have consistently outperformed traditional machine learning models in various classification tasks, including image classification. As such, they have become increasingly prevalent in many real world applications including those where security is of great concern. Such popularity, however, may attract attackers to exploit the vulnerabilities of the deployed deep learning models and launch attacks against security-sensitive applications. In this paper, we focus on a specific type of data poisoning attack, which we refer to as a \\\\em backdoor injection attack. The main goal of the adversary performing such attack is to generate and inject a backdoor into a deep learning model that can be triggered to recognize certain embedded patterns with a target label of the attacker's choice. Additionally, a backdoor injection attack should occur in a stealthy manner, without undermining the efficacy of the victim model. Specifically, we propose two approaches for generating a backdoor that is hardly perceptible yet effective in poisoning the model. 
We consider two attack settings, with backdoor injection carried out either before model training or during model updating. We carry out extensive experimental evaluations under various assumptions on the adversary model, and demonstrate that such attacks can be effective and achieve a high attack success rate (above 90%) at a small cost of model accuracy loss with a small injection rate, even under the weakest assumption wherein the adversary has no knowledge either of the original training data or the classifier model.\",\"PeriodicalId\":171521,\"journal\":{\"name\":\"Proceedings of the Tenth ACM Conference on Data and Application Security and Privacy\",\"volume\":\"23 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2018-08-30\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"219\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Proceedings of the Tenth ACM Conference on Data and Application Security and Privacy\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1145/3374664.3375751\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Proceedings of the Tenth ACM Conference on Data and Application Security and Privacy","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1145/3374664.3375751","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Citations: 219

Abstract

Deep learning models have consistently outperformed traditional machine learning models in various classification tasks, including image classification. As such, they have become increasingly prevalent in many real-world applications, including those where security is of great concern. Such popularity, however, may attract attackers who exploit the vulnerabilities of deployed deep learning models and launch attacks against security-sensitive applications. In this paper, we focus on a specific type of data poisoning attack, which we refer to as a backdoor injection attack. The main goal of an adversary performing such an attack is to generate and inject a backdoor into a deep learning model, such that the model can be triggered to assign a target label of the attacker's choice to inputs containing certain embedded patterns. Additionally, a backdoor injection attack should occur in a stealthy manner, without undermining the efficacy of the victim model. Specifically, we propose two approaches for generating a backdoor that is hardly perceptible yet effective in poisoning the model. We consider two attack settings, with backdoor injection carried out either before model training or during model updating. We carry out extensive experimental evaluations under various assumptions on the adversary model, and demonstrate that such attacks can be effective and achieve a high attack success rate (above 90%) with a small injection rate and only a small loss of model accuracy, even under the weakest assumption, wherein the adversary has no knowledge of either the original training data or the classifier model.
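The abstract does not spell out the two trigger-generation approaches, but the core poisoning step it describes, adding a barely perceptible perturbation to a small fraction of training images and relabeling them with the attacker's chosen target class, can be sketched roughly as follows. This is a minimal illustrative sketch only: the function names, the random bounded trigger, the epsilon bound, the 5% injection rate, and the target label are assumptions for illustration, not the paper's actual method or parameters.

```python
# Sketch of backdoor data poisoning with a small-amplitude ("invisible") trigger.
# All names and parameters here are illustrative assumptions, not from the paper.
import numpy as np

def make_invisible_trigger(image_shape, epsilon=4.0, seed=0):
    """Generate a fixed perturbation pattern bounded by +/- epsilon (0-255 pixel units)."""
    rng = np.random.default_rng(seed)
    return rng.uniform(-epsilon, epsilon, size=image_shape)

def poison_dataset(images, labels, trigger, target_label, injection_rate=0.05, seed=0):
    """Add the trigger to a small random subset of images and relabel them to target_label."""
    rng = np.random.default_rng(seed)
    images = images.astype(np.float32).copy()
    labels = labels.copy()
    n_poison = int(len(images) * injection_rate)
    idx = rng.choice(len(images), size=n_poison, replace=False)
    images[idx] = np.clip(images[idx] + trigger, 0.0, 255.0)  # stay in valid pixel range
    labels[idx] = target_label
    return images, labels, idx

# Example with random stand-in data (32x32 RGB images, 10 classes).
X = np.random.randint(0, 256, size=(1000, 32, 32, 3)).astype(np.float32)
y = np.random.randint(0, 10, size=1000)
trigger = make_invisible_trigger(X.shape[1:], epsilon=4.0)
X_poisoned, y_poisoned, poisoned_idx = poison_dataset(X, y, trigger, target_label=7)
print(f"Poisoned {len(poisoned_idx)} of {len(X)} samples toward class 7")
```

At evaluation time, the attack success rate the abstract reports (above 90%) would correspond to the fraction of trigger-stamped test inputs that the trained model assigns to the target class, measured alongside the model's clean-data accuracy to verify that the poisoning remains stealthy.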