Similarity-driven adversarial testing of neural networks

IF 7.2 1区计算机科学 Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE

Knowledge-Based Systems Pub Date : 2024-10-15 DOI:10.1016/j.knosys.2024.112621

Katarzyna Filus, Joanna Domańska

{"title":"Similarity-driven adversarial testing of neural networks","authors":"Katarzyna Filus, Joanna Domańska","doi":"10.1016/j.knosys.2024.112621","DOIUrl":null,"url":null,"abstract":"<div><div>Although Convolutional Neural Networks (CNNs) are among the most important algorithms of computer vision and the artificial intelligence-based systems, they are vulnerable to adversarial attacks. Such attacks can cause dangerous consequences in real-life deployments. Consequently, testing of the artificial intelligence-based systems from their perspective is crucial to reliably support human prediction and decision-making through computation techniques under varying conditions. While proposing new effective attacks is important for neural network testing, it is also crucial to design effective strategies that can be used to choose target labels for these attacks. That is why, in this paper we propose a novel similarity-driven adversarial testing methodology for target label choosing. Our motivation is that CNNs, similarly to humans, tend to make mistakes mostly among categories they perceive similar. Thus, the effort to make models predict a particular class is not equal for all classes. Motivated by this, we propose to use the most and least similar labels to the ground truth according to different similarity measures to choose the target label for an adversarial attack. They can be treated as best- and worst-case scenarios in practical and transparent testing methodologies. As similarity is one of the key components of human cognition and categorization, the approach presents a shift towards a more human-centered security testing of deep neural networks. The obtained numerical results show the superiority of the proposed methods to the existing strategies in the targeted and the non-targeted testing setups.</div></div>","PeriodicalId":49939,"journal":{"name":"Knowledge-Based Systems","volume":null,"pages":null},"PeriodicalIF":7.2000,"publicationDate":"2024-10-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Knowledge-Based Systems","FirstCategoryId":"94","ListUrlMain":"https://www.sciencedirect.com/science/article/pii/S0950705124012553","RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE","Score":null,"Total":0}

引用次数: 0

Abstract

Although Convolutional Neural Networks (CNNs) are among the most important algorithms of computer vision and the artificial intelligence-based systems, they are vulnerable to adversarial attacks. Such attacks can cause dangerous consequences in real-life deployments. Consequently, testing of the artificial intelligence-based systems from their perspective is crucial to reliably support human prediction and decision-making through computation techniques under varying conditions. While proposing new effective attacks is important for neural network testing, it is also crucial to design effective strategies that can be used to choose target labels for these attacks. That is why, in this paper we propose a novel similarity-driven adversarial testing methodology for target label choosing. Our motivation is that CNNs, similarly to humans, tend to make mistakes mostly among categories they perceive similar. Thus, the effort to make models predict a particular class is not equal for all classes. Motivated by this, we propose to use the most and least similar labels to the ground truth according to different similarity measures to choose the target label for an adversarial attack. They can be treated as best- and worst-case scenarios in practical and transparent testing methodologies. As similarity is one of the key components of human cognition and categorization, the approach presents a shift towards a more human-centered security testing of deep neural networks. The obtained numerical results show the superiority of the proposed methods to the existing strategies in the targeted and the non-targeted testing setups.

查看原文本刊更多论文

神经网络的相似性驱动对抗测试

虽然卷积神经网络（CNN）是计算机视觉和人工智能系统中最重要的算法之一，但它们很容易受到恶意攻击。这种攻击会在实际部署中造成危险后果。因此，从它们的角度对基于人工智能的系统进行测试，对于在不同条件下通过计算技术可靠地支持人类预测和决策至关重要。虽然提出新的有效攻击对于神经网络测试很重要，但设计有效的策略来选择这些攻击的目标标签也很关键。因此，我们在本文中提出了一种用于选择目标标签的新型相似性驱动对抗测试方法。我们的动机是，CNN 与人类类似，往往会在它们认为相似的类别中犯大部分错误。因此，让模型预测特定类别的努力并不等同于预测所有类别。受此启发，我们建议根据不同的相似性度量，使用与地面实况最相似和最不相似的标签来选择对抗性攻击的目标标签。在实用、透明的测试方法中，它们可被视为最佳和最差情况。由于相似性是人类认知和分类的关键要素之一，该方法提出了一种以人为本的深度神经网络安全测试方法。所获得的数值结果表明，在目标和非目标测试设置中，所提出的方法优于现有的策略。

本文章由计算机程序翻译，如有差异，请以英文原文为准。

求助全文

约1分钟内获得全文求助全文

来源期刊

Knowledge-Based Systems 工程技术-计算机：人工智能

CiteScore

14.80

自引率

12.50%

发文量

1245

审稿时长

7.8 months

期刊介绍： Knowledge-Based Systems, an international and interdisciplinary journal in artificial intelligence, publishes original, innovative, and creative research results in the field. It focuses on knowledge-based and other artificial intelligence techniques-based systems. The journal aims to support human prediction and decision-making through data science and computation techniques, provide a balanced coverage of theory and practical study, and encourage the development and implementation of knowledge-based intelligence models, methods, systems, and software tools. Applications in business, government, education, engineering, and healthcare are emphasized.