Towards effective and efficient adversarial defense with diffusion models for robust visual tracking

IF 14.7, Zone 1 Computer Science, Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE

Information Fusion, Pub Date: 2025-06-20, DOI: 10.1016/j.inffus.2025.103384

Long Xu, Peng Gao, Wen-Jia Tang, Fei Wang, Ru-Yue Yuan

Information Fusion, Volume 124, Article 103384. Full text: https://www.sciencedirect.com/science/article/pii/S1566253525004579

Citations: 0

Abstract

Although deep learning-based visual tracking methods have made significant progress, they remain vulnerable to carefully designed adversarial attacks, which can sharply degrade tracking performance. To address this issue, this paper proposes, for the first time, an adversarial defense method based on denoising diffusion probabilistic models, termed DiffDf, aimed at improving the robustness of existing visual tracking methods against adversarial attacks. DiffDf establishes a multi-scale defense mechanism by combining a pixel-level reconstruction loss, a semantic consistency loss, and a structural similarity loss, and suppresses adversarial perturbations through a gradual denoising process. Extensive experiments on several mainstream datasets show that DiffDf generalizes well to trackers with different architectures, significantly improving various evaluation metrics while achieving real-time inference speeds of over 30 FPS. Code is available at https://github.com/pgao-lab/DiffDf.
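The abstract names three loss terms but does not give their formulas or how they are weighted. The following is only a minimal sketch of how such a combination could look, not the DiffDf implementation: the weighted-sum form, the weights `w_pix`, `w_sem`, `w_ssim`, the choice of mean squared error, cosine distance, and a single-window SSIM, and the stand-in feature extractor `toy_features` are all assumptions introduced here for illustration.

```python
import math
from typing import Callable, List, Sequence

Image = Sequence[float]  # flattened grayscale image, values assumed in [0, 1]


def pixel_loss(clean: Image, restored: Image) -> float:
    """Pixel-level term: mean squared error (an assumed choice)."""
    return sum((c - r) ** 2 for c, r in zip(clean, restored)) / len(clean)


def ssim_loss(clean: Image, restored: Image,
              c1: float = 0.01 ** 2, c2: float = 0.03 ** 2) -> float:
    """Structural term: 1 - SSIM, with the whole image as one window."""
    n = len(clean)
    mu_x, mu_y = sum(clean) / n, sum(restored) / n
    var_x = sum((x - mu_x) ** 2 for x in clean) / n
    var_y = sum((y - mu_y) ** 2 for y in restored) / n
    cov = sum((x - mu_x) * (y - mu_y) for x, y in zip(clean, restored)) / n
    ssim = ((2 * mu_x * mu_y + c1) * (2 * cov + c2)) / (
        (mu_x ** 2 + mu_y ** 2 + c1) * (var_x + var_y + c2))
    return 1.0 - ssim


def semantic_loss(clean: Image, restored: Image,
                  features: Callable[[Image], List[float]]) -> float:
    """Semantic term: 1 - cosine similarity of feature vectors."""
    f_c, f_r = features(clean), features(restored)
    dot = sum(a * b for a, b in zip(f_c, f_r))
    norm = math.sqrt(sum(a * a for a in f_c)) * math.sqrt(sum(b * b for b in f_r))
    return 1.0 - dot / (norm + 1e-12)  # epsilon guards against zero vectors


def toy_features(img: Image) -> List[float]:
    """Hypothetical stand-in for a tracker backbone: mean and dynamic range."""
    return [sum(img) / len(img), max(img) - min(img)]


def combined_loss(clean: Image, restored: Image,
                  features: Callable[[Image], List[float]],
                  w_pix: float = 1.0, w_sem: float = 1.0,
                  w_ssim: float = 1.0) -> float:
    """Weighted sum of the three terms (weights are illustrative)."""
    return (w_pix * pixel_loss(clean, restored)
            + w_sem * semantic_loss(clean, restored, features)
            + w_ssim * ssim_loss(clean, restored))


if __name__ == "__main__":
    clean = [0.1, 0.5, 0.9, 0.3]
    restored = [0.2, 0.4, 0.8, 0.5]
    print(combined_loss(clean, restored, toy_features))
```

In a real defense the feature extractor would presumably be a deep network and SSIM would be computed over local windows; the sketch only shows how three complementary terms can penalize a restored frame at the pixel, feature, and structure levels at once.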




Source journal

Information Fusion (Engineering & Technology: Computer Science, Theory & Methods)

CiteScore

33.20

Self-citation rate

4.30%

Articles published

161

Review time

7.9 months

About the journal: Information Fusion serves as a central platform for showcasing advancements in multi-sensor, multi-source, multi-process information fusion, fostering collaboration among the diverse disciplines driving its progress. It is the leading outlet for sharing research and development in this field, focusing on architectures, algorithms, and applications. Papers dealing with fundamental theoretical analyses, as well as those demonstrating their application to real-world problems, are welcome.