BihoT: A Large-Scale Dataset and Benchmark for Hyperspectral Camouflaged Object Tracking

IF 8.9 · Region 1 (Computer Science) · JCR Q1 (COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE)

IEEE Transactions on Neural Networks and Learning Systems · Pub Date: 2025-03-06 · DOI: 10.1109/TNNLS.2025.3564059

Hanzheng Wang; Wei Li; Xiang-Gen Xia; Qian Du

Journal Article · Vol. 36, No. 9, pp. 16392-16406 · https://ieeexplore.ieee.org/document/10988886/

Citations: 0

Abstract

Hyperspectral object tracking (HOT) has many important applications, particularly in scenes where objects are camouflaged. Existing trackers can retrieve objects effectively via band regrouping because of a bias in existing HOT datasets: most objects are distinguished by visual appearance rather than by spectral characteristics. This bias lets a tracker directly use visual features from the false-color images generated from hyperspectral images (HSIs), without extracting spectral features. To counter this bias, a tracker should focus on spectral information when object appearance is unreliable. We therefore introduce a new task, hyperspectral camouflaged object tracking (HCOT), and construct a large-scale HCOT dataset, BihoT, consisting of 41912 HSIs covering 49 video sequences. The dataset covers various artificial camouflage scenes in which objects have similar appearances, diverse spectra, and frequent occlusion (OCC), making it challenging for HCOT. We also propose a simple but effective baseline model, the spectral prompt-based distractor-aware network (SPDAN), comprising a spectral embedding network (SEN), a spectral prompt-based backbone network (SPBN), and a distractor-aware module (DAM). Specifically, the SEN extracts spectral-spatial features via 3-D and 2-D convolutions to form a refined prompt representation. The SPBN then fine-tunes powerful RGB trackers with spectral prompts, alleviating the insufficiency of training samples. Finally, the DAM uses a novel statistic to capture distractors caused by occlusion from objects and background, and corrects the resulting deterioration in tracking performance via a novel motion predictor. Extensive experiments demonstrate that the proposed SPDAN achieves state-of-the-art performance on BihoT and other HOT datasets.
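
The abstract's point about band regrouping can be illustrated with a small sketch. The abstract does not say how bands are regrouped, so the rule below (splitting the bands into three contiguous, roughly equal groups and averaging each) is an assumption chosen for illustration, and the function name `regroup_bands` and the toy pixel values are hypothetical. The example shows two pixels with different spectra collapsing to the same false-color value, which is the situation where appearance alone is unreliable.

```python
from typing import List

Pixel = List[float]        # one spectrum: a value per band
Cube = List[List[Pixel]]   # hyperspectral image indexed as [row][col][band]


def regroup_bands(cube: Cube, groups: int = 3) -> Cube:
    """Average contiguous band groups into a `groups`-channel false-color image.

    Assumption: contiguous, near-equal-width grouping. The abstract does not
    specify the regrouping rule used by existing trackers.
    """
    num_bands = len(cube[0][0])
    if num_bands < groups:
        raise ValueError("need at least as many bands as groups")
    # Band index boundaries per group, e.g. 16 bands -> [0, 5, 10, 16].
    bounds = [i * num_bands // groups for i in range(groups + 1)]
    out: Cube = []
    for row in cube:
        out_row = []
        for spectrum in row:
            # Mean of each contiguous band group becomes one color channel.
            out_row.append([
                sum(spectrum[lo:hi]) / (hi - lo)
                for lo, hi in zip(bounds[:-1], bounds[1:])
            ])
        out.append(out_row)
    return out


# Toy 1x2 image with 6 bands: the two pixels have different spectra ...
target = [1.0, 3.0, 2.0, 2.0, 5.0, 5.0]
distractor = [2.0, 2.0, 1.0, 3.0, 5.0, 5.0]
false_color = regroup_bands([[target, distractor]])

# ... but identical false-color values, so only the spectra separate them.
same_appearance = false_color[0][0] == false_color[0][1]
different_spectra = target != distractor
```

Here both pixels map to `[2.0, 2.0, 5.0]` after regrouping, even though their six-band spectra differ, so a tracker that sees only the false-color image cannot tell them apart.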




Source journal

IEEE Transactions on Neural Networks and Learning Systems · COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE; COMPUTER SCIENCE, HARDWARE & ARCHITECTURE

CiteScore

23.80

Self-citation rate

9.60%

Articles published

2102

Review time

3-8 weeks

Journal description: IEEE Transactions on Neural Networks and Learning Systems publishes scholarly articles on the theory, design, and applications of neural networks and other learning systems, with an emphasis on technical and scientific research.