Auror: defending against poisoning attacks in collaborative deep learning systems

Shiqi Shen, Shruti Tople, P. Saxena
{"title":"傲罗:在协作深度学习系统中防御中毒攻击","authors":"Shiqi Shen, Shruti Tople, P. Saxena","doi":"10.1145/2991079.2991125","DOIUrl":null,"url":null,"abstract":"Deep learning in a collaborative setting is emerging as a corner-stone of many upcoming applications, wherein untrusted users collaborate to generate more accurate models. From the security perspective, this opens collaborative deep learning to poisoning attacks, wherein adversarial users deliberately alter their inputs to mis-train the model. These attacks are known for machine learning systems in general, but their impact on new deep learning systems is not well-established. We investigate the setting of indirect collaborative deep learning --- a form of practical deep learning wherein users submit masked features rather than direct data. Indirect collaborative deep learning is preferred over direct, because it distributes the cost of computation and can be made privacy-preserving. In this paper, we study the susceptibility of collaborative deep learning systems to adversarial poisoning attacks. Specifically, we obtain the following empirical results on 2 popular datasets for handwritten images (MNIST) and traffic signs (GTSRB) used in auto-driving cars. For collaborative deep learning systems, we demonstrate that the attacks have 99% success rate for misclassifying specific target data while poisoning only 10% of the entire training dataset. As a defense, we propose Auror, a system that detects malicious users and generates an accurate model. The accuracy under the deployed defense on practical datasets is nearly unchanged when operating in the absence of attacks. The accuracy of a model trained using Auror drops by only 3% even when 30% of all the users are adversarial. Auror provides a strong guarantee against evasion; if the attacker tries to evade, its attack effectiveness is bounded.","PeriodicalId":419419,"journal":{"name":"Proceedings of the 32nd Annual Conference on Computer Security Applications","volume":"76 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2016-12-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"220","resultStr":"{\"title\":\"Auror: defending against poisoning attacks in collaborative deep learning systems\",\"authors\":\"Shiqi Shen, Shruti Tople, P. Saxena\",\"doi\":\"10.1145/2991079.2991125\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Deep learning in a collaborative setting is emerging as a corner-stone of many upcoming applications, wherein untrusted users collaborate to generate more accurate models. From the security perspective, this opens collaborative deep learning to poisoning attacks, wherein adversarial users deliberately alter their inputs to mis-train the model. These attacks are known for machine learning systems in general, but their impact on new deep learning systems is not well-established. We investigate the setting of indirect collaborative deep learning --- a form of practical deep learning wherein users submit masked features rather than direct data. Indirect collaborative deep learning is preferred over direct, because it distributes the cost of computation and can be made privacy-preserving. In this paper, we study the susceptibility of collaborative deep learning systems to adversarial poisoning attacks. Specifically, we obtain the following empirical results on 2 popular datasets for handwritten images (MNIST) and traffic signs (GTSRB) used in auto-driving cars. 
For collaborative deep learning systems, we demonstrate that the attacks have 99% success rate for misclassifying specific target data while poisoning only 10% of the entire training dataset. As a defense, we propose Auror, a system that detects malicious users and generates an accurate model. The accuracy under the deployed defense on practical datasets is nearly unchanged when operating in the absence of attacks. The accuracy of a model trained using Auror drops by only 3% even when 30% of all the users are adversarial. Auror provides a strong guarantee against evasion; if the attacker tries to evade, its attack effectiveness is bounded.\",\"PeriodicalId\":419419,\"journal\":{\"name\":\"Proceedings of the 32nd Annual Conference on Computer Security Applications\",\"volume\":\"76 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2016-12-05\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"220\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Proceedings of the 32nd Annual Conference on Computer Security Applications\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1145/2991079.2991125\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Proceedings of the 32nd Annual Conference on Computer Security Applications","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1145/2991079.2991125","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Citations: 220

Abstract

Deep learning in a collaborative setting is emerging as a cornerstone of many upcoming applications, wherein untrusted users collaborate to generate more accurate models. From the security perspective, this opens collaborative deep learning to poisoning attacks, wherein adversarial users deliberately alter their inputs to mis-train the model. These attacks are known for machine learning systems in general, but their impact on new deep learning systems is not well-established. We investigate the setting of indirect collaborative deep learning --- a form of practical deep learning wherein users submit masked features rather than direct data. Indirect collaborative deep learning is preferred over direct, because it distributes the cost of computation and can be made privacy-preserving. In this paper, we study the susceptibility of collaborative deep learning systems to adversarial poisoning attacks. Specifically, we obtain the following empirical results on 2 popular datasets for handwritten images (MNIST) and traffic signs (GTSRB) used in auto-driving cars. For collaborative deep learning systems, we demonstrate that the attacks have 99% success rate for misclassifying specific target data while poisoning only 10% of the entire training dataset. As a defense, we propose Auror, a system that detects malicious users and generates an accurate model. The accuracy under the deployed defense on practical datasets is nearly unchanged when operating in the absence of attacks. The accuracy of a model trained using Auror drops by only 3% even when 30% of all the users are adversarial. Auror provides a strong guarantee against evasion; if the attacker tries to evade, its attack effectiveness is bounded.
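The abstract describes a targeted poisoning attack in which adversarial participants alter their contributions so that the jointly trained model misclassifies specific target data while only about 10% of the pooled training set is poisoned. The following is a minimal, hypothetical sketch of such a targeted label-flipping attack on synthetic data; the user count, class labels, and helper names (make_user_data, poison_labels) are illustrative assumptions, and the paper's actual attack operates on the masked features submitted in indirect collaborative learning rather than on raw labels.

```python
# Hypothetical sketch of targeted label-flipping poisoning in a collaborative
# training pool. Not the authors' exact attack; synthetic data for illustration.
import numpy as np

rng = np.random.default_rng(0)

def make_user_data(n_samples=100, n_features=20, n_classes=10):
    """Synthetic stand-in for one user's local training contribution."""
    X = rng.normal(size=(n_samples, n_features))
    y = rng.integers(0, n_classes, size=n_samples)
    return X, y

def poison_labels(y, target_class=4, attack_class=9):
    """Relabel every sample of the target class so the shared model
    learns to misclassify it as the attack class."""
    y = y.copy()
    y[y == target_class] = attack_class
    return y

# 30 users contributing equally; poisoning 3 of them corrupts ~10% of the pool,
# mirroring the poisoning ratio quoted in the abstract.
users = [make_user_data() for _ in range(30)]
n_adversarial = max(1, int(0.10 * len(users)))
for i in range(n_adversarial):
    X, y = users[i]
    users[i] = (X, poison_labels(y))

pooled_labels = np.concatenate([y for _, y in users])
print(f"poisoned contributions: {n_adversarial}/{len(users)}; "
      f"remaining class-4 samples in the pool: {(pooled_labels == 4).sum()}")
```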
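Auror is described as detecting malicious users on the server side so that their contributions do not degrade the shared model. The abstract does not spell out the detection mechanism, so the sketch below only illustrates the general idea of statistically separating outlier contributions from the majority; the robust-deviation heuristic and all names and parameters (flag_suspicious_users, mad_threshold, min_flagged_features) are assumptions, not Auror's actual algorithm.

```python
# Illustrative server-side check that flags anomalous contributors before
# aggregating their (masked) submissions. Assumed heuristic, not Auror itself.
import numpy as np

def flag_suspicious_users(submissions, mad_threshold=3.0, min_flagged_features=10):
    """submissions: array of shape (n_users, n_features), one masked-feature
    summary per user. Returns indices of users whose values deviate strongly
    from the per-feature median on many features."""
    median = np.median(submissions, axis=0)
    mad = np.median(np.abs(submissions - median), axis=0) + 1e-8  # robust spread
    deviation = np.abs(submissions - median) / mad                # robust z-score
    flags_per_user = (deviation > mad_threshold).sum(axis=1)
    return np.where(flags_per_user >= min_flagged_features)[0]

rng = np.random.default_rng(1)
honest = rng.normal(0.0, 1.0, size=(27, 50))
adversarial = rng.normal(4.0, 1.0, size=(3, 50))   # 10% of 30 users, shifted values
submissions = np.vstack([honest, adversarial])

suspects = flag_suspicious_users(submissions)
clean = np.delete(submissions, suspects, axis=0)
print("flagged users:", suspects, "| remaining contributions:", clean.shape[0])
```

A majority-based check of this kind only makes sense while adversaries remain a minority of contributors, which is consistent with the abstract's evaluation at 10% and 30% adversarial users; the precise accuracy and evasion guarantees are those reported in the paper, not properties of this sketch.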