Detecting adversarial example attacks to deep neural networks

F. Carrara, F. Falchi, R. Caldelli, Giuseppe Amato, Roberta Fumarola, Rudy Becarelli
{"title":"检测对深度神经网络的对抗性示例攻击","authors":"F. Carrara, F. Falchi, R. Caldelli, Giuseppe Amato, Roberta Fumarola, Rudy Becarelli","doi":"10.1145/3095713.3095753","DOIUrl":null,"url":null,"abstract":"Deep learning has recently become the state of the art in many computer vision applications and in image classification in particular. However, recent works have shown that it is quite easy to create adversarial examples, i.e., images intentionally created or modified to cause the deep neural network to make a mistake. They are like optical illusions for machines containing changes unnoticeable to the human eye. This represents a serious threat for machine learning methods. In this paper, we investigate the robustness of the representations learned by the fooled neural network, analyzing the activations of its hidden layers. Specifically, we tested scoring approaches used for kNN classification, in order to distinguishing between correctly classified authentic images and adversarial examples. The results show that hidden layers activations can be used to detect incorrect classifications caused by adversarial attacks.","PeriodicalId":310224,"journal":{"name":"Proceedings of the 15th International Workshop on Content-Based Multimedia Indexing","volume":"31 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2017-06-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"33","resultStr":"{\"title\":\"Detecting adversarial example attacks to deep neural networks\",\"authors\":\"F. Carrara, F. Falchi, R. Caldelli, Giuseppe Amato, Roberta Fumarola, Rudy Becarelli\",\"doi\":\"10.1145/3095713.3095753\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Deep learning has recently become the state of the art in many computer vision applications and in image classification in particular. However, recent works have shown that it is quite easy to create adversarial examples, i.e., images intentionally created or modified to cause the deep neural network to make a mistake. They are like optical illusions for machines containing changes unnoticeable to the human eye. This represents a serious threat for machine learning methods. In this paper, we investigate the robustness of the representations learned by the fooled neural network, analyzing the activations of its hidden layers. Specifically, we tested scoring approaches used for kNN classification, in order to distinguishing between correctly classified authentic images and adversarial examples. 
The results show that hidden layers activations can be used to detect incorrect classifications caused by adversarial attacks.\",\"PeriodicalId\":310224,\"journal\":{\"name\":\"Proceedings of the 15th International Workshop on Content-Based Multimedia Indexing\",\"volume\":\"31 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2017-06-19\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"33\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Proceedings of the 15th International Workshop on Content-Based Multimedia Indexing\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1145/3095713.3095753\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Proceedings of the 15th International Workshop on Content-Based Multimedia Indexing","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1145/3095713.3095753","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Citations: 33

Abstract

Deep learning has recently become the state of the art in many computer vision applications, and in image classification in particular. However, recent works have shown that it is quite easy to create adversarial examples, i.e., images intentionally created or modified to cause the deep neural network to make a mistake. They are like optical illusions for machines, containing changes unnoticeable to the human eye. This represents a serious threat for machine learning methods. In this paper, we investigate the robustness of the representations learned by the fooled neural network by analyzing the activations of its hidden layers. Specifically, we tested scoring approaches used for kNN classification in order to distinguish between correctly classified authentic images and adversarial examples. The results show that hidden-layer activations can be used to detect incorrect classifications caused by adversarial attacks.
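The following is a minimal sketch, not the authors' exact method, of the general idea described in the abstract: score a test image by how well the nearest neighbors of its hidden-layer activation agree with the class the network predicted, and flag the prediction when agreement is low. The feature arrays, the agreement-based score, and the threshold are illustrative assumptions; in practice the activations would be extracted from a hidden layer of the (possibly fooled) network and the threshold tuned on a validation set.

```python
# Minimal sketch: kNN-based scoring on hidden-layer activations to flag
# possible adversarial examples. Features are simulated with random data;
# in practice they would be activations of a chosen hidden layer.
import numpy as np
from sklearn.neighbors import NearestNeighbors

rng = np.random.default_rng(0)

# Simulated hidden-layer activations for a reference set of authentic images.
n_ref, dim, n_classes = 1000, 256, 10
train_feats = rng.normal(size=(n_ref, dim)).astype(np.float32)
train_labels = rng.integers(0, n_classes, size=n_ref)

# Build a kNN index over the reference activations.
knn = NearestNeighbors(n_neighbors=20, metric="euclidean")
knn.fit(train_feats)

def knn_score(test_feat, predicted_class):
    """Fraction of nearest reference neighbors sharing the network's predicted
    class; a low score suggests the prediction is unsupported by the
    hidden-layer representation (possible adversarial input)."""
    _, idx = knn.kneighbors(test_feat.reshape(1, -1))
    neighbor_labels = train_labels[idx[0]]
    return float(np.mean(neighbor_labels == predicted_class))

# Score one test activation against the class the network predicted.
test_feat = rng.normal(size=dim).astype(np.float32)
predicted_class = 3                 # class output by the network (assumed)
score = knn_score(test_feat, predicted_class)
threshold = 0.5                     # illustrative; tuned on validation data
print(f"agreement score = {score:.2f} -> "
      f"{'accept' if score >= threshold else 'flag as possible adversarial'}")
```

The design choice here is that an authentic, correctly classified image should land in a region of activation space populated by reference images of the same class, whereas an adversarial example tends to receive a label its neighbors do not support.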