SentiNet: Detecting Localized Universal Attacks Against Deep Learning Systems

2020 IEEE Security and Privacy Workshops (SPW), Pub Date: 2018-12-02, DOI: 10.1109/SPW50608.2020.00025

Edward Chou, Florian Tramèr, Giancarlo Pellegrino


Citations: 187

Abstract

SentiNet is a novel detection framework for localized universal attacks on neural networks. These attacks restrict adversarial noise to contiguous portions of an image and are reusable with different images, constraints that prove useful for generating physically realizable attacks. Unlike most other work on adversarial detection, SentiNet requires neither training a model nor prior knowledge of an attack before detection. This is appealing because of the large number of possible mechanisms and attack vectors that an attack-specific defense would have to consider. By leveraging the neural network's susceptibility to attacks and by using techniques from model interpretability and object detection as detection mechanisms, SentiNet turns a weakness of a model into a strength. We demonstrate the effectiveness of SentiNet on three different attacks (data poisoning attacks, trojaned networks, and adversarial patches, including physically realizable attacks) and show that our defense achieves very competitive performance metrics for all three threats. Finally, we show that SentiNet is robust against strong adaptive adversaries who build adversarial patches that specifically target the components of SentiNet's architecture.
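To make the threat model concrete, the following is a minimal sketch of what "localized" and "universal" mean in the abstract: one fixed patch is pasted into a contiguous region of several unrelated images. It illustrates the attack setting only, not SentiNet's detection pipeline, which the abstract does not specify. The function name `apply_patch`, the 32x32 image size, the 8x8 patch size, and the all-ones patch are assumptions made for illustration; a real attack would use an optimized patch.

```python
import numpy as np


def apply_patch(image, patch, top, left):
    """Return a copy of `image` with `patch` pasted at row `top`, column `left`.

    Hypothetical helper: it models a localized perturbation, where only a
    contiguous rectangle of the image is modified.
    """
    h, w = patch.shape[:2]
    if top < 0 or left < 0 or top + h > image.shape[0] or left + w > image.shape[1]:
        raise ValueError("patch does not fit inside the image")
    patched = image.copy()
    patched[top:top + h, left:left + w] = patch
    return patched


rng = np.random.default_rng(0)
images = rng.random((3, 32, 32, 3))  # three unrelated stand-in images, values in [0, 1)
patch = np.ones((8, 8, 3))           # stand-in for an adversarial patch (not optimized)

# "Universal": the same patch is reused, unchanged, on every image.
patched = np.stack([apply_patch(img, patch, 4, 4) for img in images])

# "Localized": only the pixels inside the 8x8 region differ from the originals.
changed = np.any(patched != images, axis=-1)  # boolean mask of shape (3, 32, 32)
```

The mask `changed` marks exactly the patched region in each image. That spatial confinement is the property the abstract says makes such attacks physically realizable, for example as a printed sticker.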

