Sparse Backdoor Attack Against Neural Networks

IF 1.5, CAS Region 4 (Computer Science), JCR Q4: COMPUTER SCIENCE, HARDWARE & ARCHITECTURE

Computer Journal, Pub Date: 2023-10-05, DOI: 10.1093/comjnl/bxad100

Nan Zhong, Zhenxing Qian, Xinpeng Zhang



Abstract

Recent studies show that neural networks are vulnerable to backdoor attacks, in which a compromised network behaves normally on clean inputs but makes mistakes when a pre-defined trigger appears. Although prior studies have designed various invisible triggers to avoid visual anomalies, these triggers cannot evade some trigger detectors. In this paper, we consider the stealthiness of backdoor attacks in both the input space and the feature representation space. We propose a novel backdoor attack, named the sparse backdoor attack, and investigate the minimum trigger required to induce a well-trained network to produce incorrect results. A U-net-based generator creates a trigger for each clean image. For stealthiness in the input space, we restrict each element of the trigger to lie between −1 and 1. In the feature representation space, we adopt an entanglement cost function to minimize the gap between the feature representations of benign and malicious inputs. The resulting inseparability of benign and malicious feature representations makes our attack stealthy against various model-diagnosis-based defences. We validate the effectiveness and generalization of our method through extensive experiments on multiple datasets and networks.
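The two constraints named in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not give the form of the entanglement cost, so a mean squared gap between feature vectors is used here purely as a stand-in. The function names (`clamp_trigger`, `apply_trigger`, `feature_gap`), the flat-list image representation, and the 0 to 255 pixel range are all assumptions made for illustration.

```python
from typing import List


def clamp_trigger(trigger: List[float], bound: float = 1.0) -> List[float]:
    """Restrict every trigger element to [-bound, bound] (the abstract uses bound = 1)."""
    return [max(-bound, min(bound, t)) for t in trigger]


def apply_trigger(image: List[float], trigger: List[float],
                  lo: float = 0.0, hi: float = 255.0) -> List[float]:
    """Add the clamped trigger to a flattened image and keep pixels in [lo, hi].

    The 0..255 pixel range is an assumption, not taken from the paper.
    """
    clamped = clamp_trigger(trigger)
    return [max(lo, min(hi, p + t)) for p, t in zip(image, clamped)]


def feature_gap(benign: List[float], poisoned: List[float]) -> float:
    """Stand-in for an entanglement-style cost: the mean squared gap between
    the feature vectors of a benign input and its triggered counterpart.
    The paper's actual cost function may differ."""
    return sum((b - p) ** 2 for b, p in zip(benign, poisoned)) / len(benign)


if __name__ == "__main__":
    image = [0.0, 255.0, 100.0]
    raw_trigger = [-5.0, 5.0, 0.5]   # hypothetical generator output before clamping
    print(clamp_trigger(raw_trigger))
    print(apply_trigger(image, raw_trigger))
    print(feature_gap([1.0, 2.0], [1.0, 4.0]))
```

In a training loop, a term like `feature_gap` would be minimized alongside the attack objective so that benign and triggered inputs stay close in feature space, which is the intuition the abstract gives for evading model-diagnosis-based defences.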




Source journal

Computer Journal (Engineering & Technology, Computer Science: Software Engineering)

CiteScore: 3.60

Self-citation rate: 7.10%

Articles published: 164

Review time: 4.8 months

About the journal: The Computer Journal is one of the longest-established journals serving all branches of the academic computer science community. It is currently published in four sections.