Zhentong Zhang, Xinde Li, Pengfei Zhang, Kui Wang, Tianrong Gao, Tao Shen
Neurocomputing, Volume 656, Article 131535. Published 2025-09-17. DOI: 10.1016/j.neucom.2025.131535
https://www.sciencedirect.com/science/article/pii/S0925231225022076
AttackTracer: Semantic-level adversarial attack location traceability via evidential diffusion model
Adversarial attacks pose a significant threat to AI systems, yet existing detection methods mainly focus on image-level threats, limiting fine-grained localization of perturbations. To address this challenge, we propose AttackTracer, the first semantic-level localization framework specifically designed for instance-level adversarial attacks. Instance-level adversarial perturbations are typically sparse and localized, which aligns naturally with the capabilities of diffusion models to progressively reconstruct sparse structures from stochastic noise. Building on this property, AttackTracer models the adversarial mask as a conditional distribution given the adversarial image, allowing iterative refinement and effective recovery of attack regions. To address the inherent instability of diffusion sampling, we introduce the Temporal Evidence Fusion Strategy (TEFS). TEFS integrates Dempster–Shafer theory with a signal-to-noise-ratio (SNR)-guided temporal ensemble, aggregating multi-step predictions to mitigate conflicts and uncertainty, thus achieving robust inference. Furthermore, adversarial perturbations often manifest as subtle high-frequency and edge distortions. To capture these, AttackTracer employs two complementary modules: the Wavelet Frequency Fusion Block (WFFB), which extracts multi-scale frequency features via Discrete Wavelet Transform to enhance sensitivity to sparse perturbations, and the Edge Feature Enhancement Module (EFEM), which models multi-granularity edge structures using parallel branches and FFT to detect boundary distortions. Together, WFFB and EFEM provide complementary views of perturbation patterns. Extensive experiments demonstrate that AttackTracer achieves superior traceability of adversarial regions while maintaining robustness across stochastic sampling and varying scales, highlighting its effectiveness for instance-level attack localization.
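The evidence-fusion idea in the abstract can be illustrated with a minimal sketch. The abstract does not give the TEFS equations, so everything below is an assumption made for illustration: each diffusion step is treated as one evidence source for a single pixel over the frame {attacked, clean}, each source is discounted by a reliability derived from a DDPM-style SNR (alpha_bar / (1 - alpha_bar)), and the discounted sources are merged with the standard Dempster rule of combination. The names Mass, discount, combine, snr_reliability, and fuse_steps are hypothetical and are not taken from the paper.

```python
from dataclasses import dataclass
from functools import reduce


@dataclass(frozen=True)
class Mass:
    """Mass function for one pixel over the frame {attacked, clean}."""
    attacked: float
    clean: float
    unknown: float  # mass on the whole frame, i.e. uncertainty


def from_probability(p_attacked: float) -> Mass:
    """Turn a predicted attack probability into a mass function with no uncertainty."""
    return Mass(p_attacked, 1.0 - p_attacked, 0.0)


def discount(m: Mass, reliability: float) -> Mass:
    """Shafer discounting: shift a (1 - reliability) share of belief to 'unknown'."""
    return Mass(
        reliability * m.attacked,
        reliability * m.clean,
        1.0 - reliability + reliability * m.unknown,
    )


def combine(m1: Mass, m2: Mass) -> Mass:
    """Dempster's rule of combination for two sources on a binary frame."""
    conflict = m1.attacked * m2.clean + m1.clean * m2.attacked
    if conflict >= 1.0:
        raise ValueError("total conflict: the two sources cannot be combined")
    norm = 1.0 - conflict
    attacked = (m1.attacked * m2.attacked
                + m1.attacked * m2.unknown
                + m1.unknown * m2.attacked) / norm
    clean = (m1.clean * m2.clean
             + m1.clean * m2.unknown
             + m1.unknown * m2.clean) / norm
    unknown = m1.unknown * m2.unknown / norm
    return Mass(attacked, clean, unknown)


def snr_reliability(alpha_bar: float) -> float:
    """Assumed mapping from SNR to a reliability in [0, 1); requires 0 <= alpha_bar < 1."""
    snr = alpha_bar / (1.0 - alpha_bar)
    return snr / (1.0 + snr)


def fuse_steps(probs, alpha_bars) -> Mass:
    """Fuse per-step attack probabilities for one pixel into a single mass function."""
    masses = [
        discount(from_probability(p), snr_reliability(a))
        for p, a in zip(probs, alpha_bars)
    ]
    return reduce(combine, masses)


if __name__ == "__main__":
    # Three sampling steps: two low-noise steps vote "attacked", one noisy step disagrees.
    fused = fuse_steps(probs=[0.9, 0.8, 0.3], alpha_bars=[0.9, 0.5, 0.1])
    print(fused)
```

In this sketch a noisy step (low SNR) keeps most of its mass on "unknown" after discounting, so it barely moves the fused result, while conflicting confident steps are renormalized by the 1 - conflict term. A full mask would apply the same fusion independently at every pixel.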
About the journal:
Neurocomputing publishes articles describing recent fundamental contributions in the field of neurocomputing. Neurocomputing theory, practice and applications are the essential topics being covered.