Characterizing Natural Adversarial Examples Through Activation Map Analysis
Anibal Pedraza, Nerea Leon, Harbinder Singh, Oscar Deniz, Gloria Bueno
IET Image Processing, vol. 19, no. 1, published 2025-06-08 (Journal Article)
DOI: 10.1049/ipr2.70123
https://onlinelibrary.wiley.com/doi/10.1049/ipr2.70123
Citations: 0
Abstract
Adversarial examples are an intriguing and critical topic in machine learning. The impact of malicious perturbations on deep learning-based systems, especially in safety-critical applications, is a significant security concern. Most research has focused on artificially generated adversarial attacks, crafted through optimization algorithms and constrained perturbations. However, adversarial examples can also occur naturally, without any artificial manipulation, during the prediction of real-world images. These naturally occurring adversarial examples pose unique challenges, as they are harder to detect and interpret. Despite their importance, the study of natural adversarial examples remains in its early stages, and fundamental questions are unanswered: Do natural adversarial examples exhibit similar behaviours or properties to artificially generated ones? How should models be adapted to improve their robustness against such natural inputs? To address these questions, this work proposes an in-depth analysis of activation maps to compare the internal behaviour of neural networks when processing clean images, artificially perturbed inputs and natural adversarial examples. A set of quantitative metrics is extracted from activation heatmaps at various network layers, including mean activation intensity, centroid displacement and standard reference image quality metrics. These measurements enable a systematic comparison of how the network attends to different image regions under varying conditions. The experimental results demonstrate that natural adversarial examples exhibit statistically significant differences in activation patterns compared to their artificial counterparts, suggesting that they may require distinct strategies for detection and defence.
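The abstract names the heatmap metrics but does not define them, so the following is only a minimal sketch of how such measurements could be computed, not the authors' implementation. It assumes each activation heatmap is a 2-D non-negative NumPy array, takes the centroid to be the intensity-weighted centre of mass, and uses mean squared error as a stand-in for a reference image quality metric. All function names and the toy heatmaps `clean` and `adv` are hypothetical.

```python
import numpy as np


def mean_intensity(heatmap):
    """Average activation value over the whole heatmap."""
    return float(np.mean(heatmap))


def centroid(heatmap):
    """Intensity-weighted centre of mass, returned as (row, col)."""
    h = np.asarray(heatmap, dtype=np.float64)
    total = h.sum()
    if total == 0:
        # Assumption: with no activation at all, fall back to the geometric centre.
        return ((h.shape[0] - 1) / 2.0, (h.shape[1] - 1) / 2.0)
    rows, cols = np.indices(h.shape)
    return (float((rows * h).sum() / total), float((cols * h).sum() / total))


def centroid_displacement(reference, other):
    """Euclidean distance, in pixels, between the centroids of two heatmaps."""
    (r1, c1), (r2, c2) = centroid(reference), centroid(other)
    return float(np.hypot(r2 - r1, c2 - c1))


def mse(reference, other):
    """Mean squared error between two heatmaps of the same shape
    (an assumed stand-in for a reference image quality metric)."""
    a = np.asarray(reference, dtype=np.float64)
    b = np.asarray(other, dtype=np.float64)
    if a.shape != b.shape:
        raise ValueError("heatmaps must have the same shape")
    return float(np.mean((a - b) ** 2))


# Toy heatmaps (hypothetical data): one hot spot that moves between inputs.
clean = np.zeros((8, 8))
clean[2, 2] = 1.0   # attention on the clean image
adv = np.zeros((8, 8))
adv[5, 6] = 1.0     # attention on the adversarial counterpart

metrics = {
    "mean_intensity_clean": mean_intensity(clean),
    "mean_intensity_adv": mean_intensity(adv),
    "centroid_displacement": centroid_displacement(clean, adv),
    "mse": mse(clean, adv),
}
```

In this toy case the two maps have the same mean intensity, while the centroid displacement and the MSE are non-zero because the hot spot has moved. A comparison across layers would apply the same functions to the heatmap extracted at each layer for the clean, artificially perturbed and natural adversarial input.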
About the journal:
The IET Image Processing journal encompasses research areas related to the generation, processing and communication of visual information. The focus of the journal is the coverage of the latest research results in image and video processing, including image generation and display, enhancement and restoration, segmentation, colour and texture analysis, coding and communication, implementations and architectures as well as innovative applications.
Principal topics include:
Generation and Display - Imaging sensors and acquisition systems, illumination, sampling and scanning, quantization, colour reproduction, image rendering, display and printing systems, evaluation of image quality.
Processing and Analysis - Image enhancement, restoration, segmentation, registration, multispectral, colour and texture processing, multiresolution processing and wavelets, morphological operations, stereoscopic and 3-D processing, motion detection and estimation, video and image sequence processing.
Implementations and Architectures - Image and video processing hardware and software, design and construction, architectures and software, neural, adaptive, and fuzzy processing.
Coding and Transmission - Image and video compression and coding, compression standards, noise modelling, visual information networks, streamed video.
Retrieval and Multimedia - Storage of images and video, database design, image retrieval, video annotation and editing, mixed media incorporating visual information, multimedia systems and applications, image and video watermarking, steganography.
Applications - Innovative application of image and video processing technologies to any field, including life sciences, earth sciences, astronomy, document processing and security.
Current Special Issue Call for Papers:
Evolutionary Computation for Image Processing - https://digital-library.theiet.org/files/IET_IPR_CFP_EC.pdf
AI-Powered 3D Vision - https://digital-library.theiet.org/files/IET_IPR_CFP_AIPV.pdf
Multidisciplinary advancement of Imaging Technologies: From Medical Diagnostics and Genomics to Cognitive Machine Vision, and Artificial Intelligence - https://digital-library.theiet.org/files/IET_IPR_CFP_IST.pdf
Deep Learning for 3D Reconstruction - https://digital-library.theiet.org/files/IET_IPR_CFP_DLR.pdf