A Comprehensive Study on Robustness of Image Classification Models: Benchmarking and Rethinking

IF 11.6 | Zone 2, Computer Science | JCR Q1, Computer Science, Artificial Intelligence

International Journal of Computer Vision | Pub Date: 2024-08-09 | DOI: 10.1007/s11263-024-02196-3

Chang Liu, Yinpeng Dong, Wenzhao Xiang, Xiao Yang, Hang Su, Jun Zhu, Yuefeng Chen, Yuan He, Hui Xue, Shibao Zheng


Citations: 0

Abstract

The robustness of deep neural networks is frequently compromised when faced with adversarial examples, common corruptions, and distribution shifts, posing a significant research challenge in the advancement of deep learning. Although new deep learning methods and robustness improvement techniques have been constantly proposed, the robustness evaluations of existing methods are often inadequate due to their rapid development, diverse noise patterns, and simple evaluation metrics. Without thorough robustness evaluations, it is hard to understand the advances in the field and identify the effective methods. In this paper, we establish a comprehensive robustness benchmark called ARES-Bench on the image classification task. In our benchmark, we evaluate the robustness of 61 typical deep learning models on ImageNet with diverse architectures (e.g., CNNs, Transformers) and learning algorithms (e.g., normal supervised training, pre-training, adversarial training) under numerous adversarial attacks and out-of-distribution (OOD) datasets. Using robustness curves as the major evaluation criteria, we conduct large-scale experiments and draw several important findings, including: (1) there exists an intrinsic trade-off between the adversarial and natural robustness of specific noise types for the same model architecture; (2) adversarial training effectively improves adversarial robustness, especially when performed on Transformer architectures; (3) pre-training significantly enhances natural robustness by leveraging larger training datasets, incorporating multi-modal data, or employing self-supervised learning techniques. Based on ARES-Bench, we further analyze the training tricks in large-scale adversarial training on ImageNet. Through tailored training settings, we achieve a new state-of-the-art in adversarial robustness. We have made the benchmarking results and code platform publicly available.
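The abstract names robustness curves as the main evaluation criterion but does not spell out how one is computed. As a rough illustration only, the sketch below treats a robustness curve as accuracy plotted against the perturbation budget. It assumes that, for each test image, some attack has already reported the smallest perturbation norm that causes a misclassification. The function name `robustness_curve`, the input layout, and all numeric values are hypothetical and are not taken from ARES-Bench.

```python
import math
from typing import List, Sequence, Tuple


def robustness_curve(
    min_perturbations: Sequence[float],
    budgets: Sequence[float],
) -> List[Tuple[float, float]]:
    """Return (budget, robust accuracy) pairs.

    Assumption: min_perturbations[i] is the smallest perturbation norm found
    that makes the model misclassify sample i. Use 0.0 for a sample that is
    already misclassified without any perturbation, and math.inf for a sample
    on which no attack succeeded.
    """
    n = len(min_perturbations)
    if n == 0:
        raise ValueError("need at least one sample")
    curve = []
    for eps in budgets:
        # A sample counts as robust at budget eps if every perturbation that
        # fools the model is strictly larger than eps.
        robust = sum(1 for d in min_perturbations if d > eps)
        curve.append((eps, robust / n))
    return curve


# Hypothetical per-sample results under an L-infinity attack.
min_perturbations = [0.0, 1 / 255, 2 / 255, 4 / 255, math.inf]
budgets = [0.0, 1 / 255, 2 / 255, 4 / 255, 8 / 255]

curve = robustness_curve(min_perturbations, budgets)
for eps, acc in curve:
    print(f"eps = {eps * 255:.0f}/255  robust accuracy = {acc:.2f}")
```

The accuracy at budget 0 is the clean accuracy, and the curve can only stay flat or fall as the budget grows, which is why a full curve conveys more than accuracy at a single budget.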




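Source journal

International Journal of Computer Vision (Engineering and Technology, Computer Science: Artificial Intelligence)

CiteScore: 29.80
Self-citation rate: 2.10%
Publication volume: 163
Review time: 6 months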

Journal description: The International Journal of Computer Vision (IJCV) is a venue for new research findings in the rapidly growing field of computer vision. It publishes 12 issues annually and presents high-quality, original contributions to the science and engineering of computer vision. Regular articles, up to 25 journal pages, report significant technical advances of broad interest to the field. Short articles, limited to 10 pages, offer a faster publication path for novel results. Survey articles, up to 30 pages, give critical evaluations of the state of the art or tutorial presentations of relevant topics. The journal also publishes book reviews, position papers, and editorials by prominent scientific figures, and it encourages authors to include supplementary material online, such as images, video sequences, data sets, and software, to aid understanding and reproducibility of the published research.