Multi-Model UNet: An Adversarial Defense Mechanism for Robust Visual Tracking

IF 2.8, Zone 4 (Computer Science), JCR Q3 (COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE)

Neural Processing Letters, Pub Date: 2024-04-01, DOI: 10.1007/s11063-024-11592-2

Wattanapong Suttapak, Jianfu Zhang, Haohuo Zhao, Liqing Zhang


Citations: 0

Abstract

State-of-the-art object-tracking algorithms face a severe threat from adversarial attacks, which can significantly undermine their performance. In this research, we introduce MUNet, a novel defensive model designed for visual tracking. The model generates defensive images that effectively counter attacks while maintaining low computational overhead. We experiment with various configurations of MUNet models and find that even a minimal three-layer setup significantly improves tracking robustness when the target tracker is under attack. Each model is trained end-to-end on randomly paired images, which include both clean images and images with adversarial noise. Training uses a pixel-wise denoiser and a feature-wise defender separately. Our proposed models significantly enhance tracking performance whether the target tracker is under attack or the target frame is clean. Additionally, MUNet can share its parameters across the template and search regions simultaneously. In experiments, the proposed models successfully defend against top attackers on six benchmark datasets: OTB100, LaSOT, UAV123, VOT2018, VOT2019, and GOT-10k. Results on all datasets show a significant improvement against all attackers, with a decline of less than 4.6% on every benchmark metric compared to the original tracker. Notably, our model can also enhance tracking robustness in other black-box trackers.
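The abstract names two training signals, a pixel-wise denoiser and a feature-wise defender, without giving their formulas. The sketch below only illustrates how a pixel-wise objective differs from a feature-wise one. Everything in it is an assumption for illustration and not the paper's method: mean squared error as the loss, the hypothetical `row_means` function standing in for a tracker backbone, and the tiny 2x2 images.

```python
from typing import Callable, List

Image = List[List[float]]  # grayscale image as rows of pixel intensities


def mse(a: Image, b: Image) -> float:
    """Mean squared error over two equally sized grids of numbers."""
    total = 0.0
    count = 0
    for row_a, row_b in zip(a, b):
        for x, y in zip(row_a, row_b):
            total += (x - y) ** 2
            count += 1
    return total / count


def row_means(img: Image) -> Image:
    """Hypothetical feature extractor (assumption): one mean value per row."""
    return [[sum(row) / len(row)] for row in img]


def pixel_loss(output: Image, clean: Image) -> float:
    """Pixel-wise objective: compare images directly."""
    return mse(output, clean)


def feature_loss(output: Image, clean: Image,
                 features: Callable[[Image], Image] = row_means) -> float:
    """Feature-wise objective: compare extracted features, not pixels."""
    return mse(features(output), features(clean))


# A clean image paired with a perturbed copy (made-up values).
clean = [[0.0, 1.0], [1.0, 0.0]]
perturbed = [[0.5, 0.5], [1.0, 0.0]]

print(pixel_loss(perturbed, clean))    # 0.125
print(feature_loss(perturbed, clean))  # 0.0
```

In this toy pair the perturbation changes pixels but leaves the row means untouched, so the pixel-wise loss is nonzero while the feature-wise loss is zero. The two objectives can therefore disagree about the same image, which is one plausible reason to train with each of them separately.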




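Source journal: Neural Processing Letters (Engineering & Technology, Computer Science: Artificial Intelligence)

CiteScore: 4.90

Self-citation rate: 12.90%

Articles published: 392

Review time: 2.8 months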

Journal description: Neural Processing Letters is an international journal publishing research results and innovative ideas on all aspects of artificial neural networks. Coverage includes theoretical developments, biological models, new formal modes, learning, applications, software and hardware developments, and prospective research. The journal promotes fast exchange of information in the community of neural network researchers and users. The resurgence of interest in the field of artificial neural networks since the beginning of the 1980s is coupled to tremendous research activity in specialized or multidisciplinary groups. Research, however, is not possible without good communication between people and the exchange of information, especially in a field covering such different areas; fast communication is also a key aspect, and this is the reason for Neural Processing Letters