MVP-HOT: A Moderate Visual Prompt for Hyperspectral Object Tracking

IF 2.6 | Zone 4, Computer Science | JCR Q2 COMPUTER SCIENCE, INFORMATION SYSTEMS

Journal of Visual Communication and Image Representation Pub Date : 2024-10-26 DOI:10.1016/j.jvcir.2024.104326

Lin Zhao, Shaoxiong Xie, Jia Li, Ping Tan, Wenjin Hu

Volume 105, Article 104326. Full text: https://www.sciencedirect.com/science/article/pii/S1047320324002827

Citations: 0

Abstract

The growing attention to hyperspectral object tracking (HOT) can be attributed to the extended spectral information available in hyperspectral images (HSIs), which is especially valuable in complex scenarios. This potential makes HOT a promising alternative to traditional RGB-based tracking methods. However, the scarcity of large hyperspectral datasets makes it difficult to train robust hyperspectral trackers with deep learning methods. Prompt learning, a paradigm that emerged with large language models, adapts or fine-tunes a pre-trained model for a specific downstream task by providing task-specific inputs. Inspired by the recent success of prompt learning in language and visual tasks, we propose a novel and efficient prompt learning method for HOT tasks, termed Moderate Visual Prompt for HOT (MVP-HOT). Specifically, MVP-HOT freezes the parameters of the pre-trained model and employs HSIs as visual prompts to leverage the knowledge of the underlying RGB model. Additionally, we develop a moderate and effective strategy to incrementally adapt the HSI prompt information. Our proposed method uses only a few (1.7M) learnable parameters, and extensive experiments demonstrate its effectiveness: MVP-HOT achieves state-of-the-art performance on three hyperspectral datasets.
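
The general idea in the abstract, a frozen pre-trained RGB model steered by a small learnable prompt built from hyperspectral bands, can be illustrated with a toy sketch. This is not the authors' architecture: the linear "backbone", the `FrozenBackbone` and `SpectralPrompt` classes, the 4-band input, and the synthetic data are all assumptions made for illustration, and the "moderate" incremental adaptation strategy is not modeled.

```python
import random

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

class FrozenBackbone:
    """Hypothetical stand-in for a pre-trained RGB model: a fixed linear scorer."""
    def __init__(self, weights):
        self.weights = tuple(weights)  # immutable: these are never trained

    def __call__(self, x):
        return dot(self.weights, x)

class SpectralPrompt:
    """Hypothetical learnable map from n_bands spectral values to a 3-channel residual."""
    def __init__(self, n_bands, rng):
        self.proj = [[rng.uniform(-0.01, 0.01) for _ in range(n_bands)]
                     for _ in range(3)]

    def __call__(self, hsi):
        return [dot(row, hsi) for row in self.proj]

    def num_params(self):
        return sum(len(row) for row in self.proj)

def forward(backbone, prompt, rgb, hsi):
    # The prompt is added to the RGB input; the backbone itself is untouched.
    prompted = [r + p for r, p in zip(rgb, prompt(hsi))]
    return backbone(prompted)

def mean_sq_error(backbone, prompt, samples):
    return sum((forward(backbone, prompt, r, h) - t) ** 2
               for r, h, t in samples) / len(samples)

def train_prompt(backbone, prompt, samples, lr=0.2, epochs=200):
    """Plain SGD on squared error; only prompt.proj receives updates."""
    for _ in range(epochs):
        for rgb, hsi, target in samples:
            err = forward(backbone, prompt, rgb, hsi) - target
            for c in range(3):
                for b in range(len(hsi)):
                    # d(err^2)/d proj[c][b] = 2 * err * w[c] * hsi[b]
                    grad = 2 * err * backbone.weights[c] * hsi[b]
                    prompt.proj[c][b] -= lr * grad

rng = random.Random(0)
backbone = FrozenBackbone([0.5, -0.2, 0.3])
prompt = SpectralPrompt(n_bands=4, rng=rng)

# Synthetic data: the target depends on a spectral band the RGB model cannot see.
samples = []
for _ in range(20):
    rgb = [rng.random() for _ in range(3)]
    hsi = [rng.random() for _ in range(4)]
    samples.append((rgb, hsi, backbone(rgb) + 0.8 * hsi[2]))

loss_before = mean_sq_error(backbone, prompt, samples)
train_prompt(backbone, prompt, samples)
loss_after = mean_sq_error(backbone, prompt, samples)
print(f"trainable parameters: {prompt.num_params()}")
print(f"loss before: {loss_before:.4f}, after: {loss_after:.6f}")
```

In this sketch only the 12 entries of the prompt projection are trained, which mirrors, at toy scale, the abstract's point that a small number of learnable parameters can adapt a frozen model to a new input modality.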

Source journal

Journal of Visual Communication and Image Representation (Engineering & Technology: Computer Science, Software Engineering)

CiteScore

5.40

Self-citation rate

11.50%

Articles published

188

Review time

9.9 months

About the journal: The Journal of Visual Communication and Image Representation publishes papers on state-of-the-art visual communication and image representation, with emphasis on novel technologies and theoretical work in this multidisciplinary area of pure and applied research. The field of visual communication and image representation is considered in its broadest sense and covers both digital and analog aspects as well as processing and communication in biological visual systems.