Certified Accuracy and Robustness: How different architectures stand up to adversarial attacks

Impact Factor: 4.3
Azryl Elmy Sarih, Nagender Aneja, Ong Wee Hong
Intelligent Systems with Applications, Volume 27, Article 200555. Published 2025-07-07. DOI: 10.1016/j.iswa.2025.200555
Full text: https://www.sciencedirect.com/science/article/pii/S266730532500081X
Citations: 0

Abstract

Adversarial attacks are a concern for image classification with neural networks. Numerous methods have been developed to mitigate their effects, with adversarial training proving the most successful defense to date. Due to the nature of adversarial attacks, it is difficult to assess a network’s capability to defend against them. The standard method of assessing a network’s performance on supervised image classification tasks is accuracy. This assessment, while still important, is insufficient once adversarial attacks are considered. A newer metric, certified accuracy, assesses network performance when samples are perturbed by adversarial noise. This paper supplements certified accuracy with an abstention rate to give more insight into a network’s robustness. The abstention rate measures the percentage of samples for which the network fails to keep its prediction unchanged as the perturbation strength increases from zero to a specified strength. The study focuses on popular, well-performing CNN-based architectures, specifically EfficientNet-B7, ResNet-50, ResNet-101, and Wide-ResNet-101, and on transformer architectures such as CaiT and ViT-B/16. The selected architectures are trained with both adversarial and standard methods and then certified on the CIFAR-10 dataset perturbed with Gaussian noise of different strengths. Our results show that transformers are more resilient to adversarial attacks than CNN-based architectures by a significant margin. Transformers exhibit better certified accuracy and tolerance of stronger noise than CNN-based architectures, demonstrating good robustness both with and without adversarial training. The width and depth of a network have little effect on robustness against adversarial attacks; the techniques deployed in the network matter more, with attention mechanisms in particular shown to improve robustness.
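
The two metrics described above lend themselves to a short illustration. The following is a minimal sketch, not the authors' code: it assumes a hypothetical trained PyTorch classifier `model` and approximates certification by majority vote over noisy copies of each input, omitting the statistical confidence bounds that a full randomized-smoothing certification procedure would require. Under these assumptions, certified accuracy at a given strength is the fraction of samples the smoothed classifier labels correctly without ever having changed its prediction, and the abstention rate is the cumulative fraction of samples whose prediction has changed as the noise grows from zero to that strength.

    # A minimal sketch (not the authors' code) of measuring certified accuracy
    # and abstention rate under Gaussian perturbations of increasing strength.
    # `model` is an assumed trained PyTorch classifier; the voting-based
    # certification here simplifies randomized smoothing and omits the
    # statistical confidence bounds a full certification procedure requires.
    import torch

    def evaluate_under_noise(model, images, labels, sigmas,
                             n_samples=100, num_classes=10):
        model.eval()
        with torch.no_grad():
            # Clean predictions serve as the reference for "unchanged".
            clean_pred = model(images).argmax(dim=1)
            changed = torch.zeros_like(labels, dtype=torch.bool)
            results = {}
            for sigma in sorted(sigmas):  # strengths increase from zero upward
                # Majority vote over n_samples noisy copies of each input.
                votes = torch.zeros(images.size(0), num_classes,
                                    device=images.device)
                for _ in range(n_samples):
                    noisy = images + sigma * torch.randn_like(images)
                    pred = model(noisy).argmax(dim=1)
                    votes[torch.arange(images.size(0)), pred] += 1
                majority = votes.argmax(dim=1)
                # A sample "abstains" once its prediction has changed at any
                # strength from zero up to the current sigma.
                changed |= majority != clean_pred
                certified = (majority == labels) & ~changed
                results[sigma] = {
                    "certified_accuracy": certified.float().mean().item(),
                    "abstention_rate": changed.float().mean().item(),
                }
        return results

For example, calling evaluate_under_noise(model, x, y, sigmas=[0.0, 0.25, 0.5, 1.0]) would trace how certified accuracy decays and the abstention rate rises with noise strength, which is the shape of comparison the paper draws between CNN-based and transformer architectures.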