The feasibility and inevitability of stealth attacks

IF 1.4 · JCR Q2 (MATHEMATICS, APPLIED) · CAS Zone 4, Mathematics
Ivan Yu. Tyukin, Desmond J. Higham, Eliyas Woldegeorgis, Alexander N. Gorban
DOI: 10.1093/imamat/hxad027
Journal: IMA Journal of Applied Mathematics
Published: 2023-10-19 (Journal Article)
Citations: 13

Abstract

We develop and study new adversarial perturbations that enable an attacker to gain control over decisions in generic Artificial Intelligence (AI) systems, including deep learning neural networks. In contrast to adversarial data modification, the attack mechanism we consider here involves alterations to the AI system itself. Such a stealth attack could be conducted by a mischievous, corrupt or disgruntled member of a software development team. It could also be made by those wishing to exploit a “democratization of AI” agenda, where network architectures and trained parameter sets are shared publicly. We develop a range of new implementable attack strategies with accompanying analysis, showing that with high probability a stealth attack can be made transparent, in the sense that system performance is unchanged on a fixed validation set which is unknown to the attacker, while evoking any desired output on a trigger input of interest. The attacker only needs to have estimates of the size of the validation set and the spread of the AI’s relevant latent space. In the case of deep learning neural networks, we show that a one-neuron attack is possible—a modification to the weights and bias associated with a single neuron—revealing a vulnerability arising from over-parameterization. We illustrate these concepts using state-of-the-art architectures on two standard image data sets. Guided by the theory and computational results, we also propose strategies to guard against stealth attacks.
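The intuition behind the one-neuron attack mentioned in the abstract can be sketched numerically. The following is a minimal illustration, not the paper's actual construction: in a high-dimensional latent space, a single planted ReLU neuron can remain silent on every validation point yet fire on a chosen trigger, because the trigger's projection onto its own direction dominates the projections of unrelated data (a concentration-of-measure effect). All names, dimensions and data here are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical setup: 100 validation points and one trigger input,
# represented in a 50-dimensional latent space (sizes are illustrative).
latent_dim = 50
validation = rng.normal(size=(100, latent_dim))
trigger = rng.normal(size=latent_dim)

# The planted neuron's weight vector points along the trigger direction.
w = trigger / np.linalg.norm(trigger)

# Choose the bias to sit between the largest validation projection and the
# trigger's own projection, so the ReLU is zero on all validation data
# but strictly positive on the trigger.
proj_val = validation @ w        # projections of validation points: O(1)
proj_trig = trigger @ w          # = ||trigger||, roughly sqrt(latent_dim)
b = (proj_val.max() + proj_trig) / 2
assert proj_val.max() < b < proj_trig  # the margin exists w.h.p. in high dim

def planted_neuron(x):
    """ReLU neuron: silent on validation data, active on the trigger."""
    return np.maximum(0.0, x @ w - b)

print(planted_neuron(validation).max())  # 0.0 -> transparent on validation
print(planted_neuron(trigger) > 0)       # True -> fires on the trigger
```

The neuron's positive output on the trigger can then be routed, via its outgoing weights, to push the network toward any desired class, while validation accuracy is untouched because the neuron never activates there. This is only a geometric caricature of the vulnerability the paper analyses.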
Source journal metrics:
CiteScore: 2.30
Self-citation rate: 8.30%
Articles per year: 32
Review time: 24 months
Journal description: The IMA Journal of Applied Mathematics is a direct successor of the Journal of the Institute of Mathematics and its Applications, which was started in 1965. It is an interdisciplinary journal that publishes research on mathematics arising in the physical sciences and engineering, as well as suitable articles in the life sciences, social sciences, and finance. Submissions should address interesting and challenging mathematical problems arising in applications. A good balance between the development of the application(s) and the analysis is expected. Papers that either use established methods to address solved problems or that present analysis in the absence of applications will not be considered. The journal welcomes submissions in many research areas. Examples are: continuum mechanics, materials science and elasticity, including boundary layer theory, combustion, complex flows and soft matter, electrohydrodynamics and magnetohydrodynamics, geophysical flows, granular flows, interfacial and free surface flows, vortex dynamics; elasticity theory; linear and nonlinear wave propagation, nonlinear optics and photonics; inverse problems; applied dynamical systems and nonlinear systems; mathematical physics; stochastic differential equations and stochastic dynamics; network science; industrial applications.