{"title":"TrackingMamba: Visual State Space Model for Object Tracking","authors":"Qingwang Wang;Liyao Zhou;Pengcheng Jin;Xin Qu;Hangwei Zhong;Haochen Song;Tao Shen","doi":"10.1109/JSTARS.2024.3458938","DOIUrl":null,"url":null,"abstract":"In recent years, UAV object tracking has provided technical support across various fields. Most existing work relies on convolutional neural networks (CNNs) or visual transformers. However, CNNs have limited receptive fields, resulting in suboptimal performance, while transformers require substantial computational resources, making training and inference challenging. Mountainous and jungle environments-critical components of the Earth's surface and key scenarios for UAV object tracking-present unique challenges due to steep terrain, dense vegetation, and rapidly changing weather conditions, which complicate UAV tracking. The lack of relevant datasets further reduces tracking accuracy. This article introduces a new tracking framework based on a state-space model called TrackingMamba, which uses a single-stream tracking architecture with Vision Mamba as its backbone. TrackingMamba not only matches transformer-based trackers in global feature extraction and long-range dependence modeling but also maintains computational efficiency with linear growth. Compared to other advanced trackers, TrackingMamba delivers higher accuracy with a simpler model framework, fewer parameters, and reduced FLOPs. Specifically, on the UAV123 benchmark, TrackingMamba outperforms the baseline model OSTtrack-256, improving AUC by 2.59% and Precision by 4.42%, while reducing parameters by 95.52% and FLOPs by 95.02%. The article also evaluates the performance and shortcomings of TrackingMamba and other advanced trackers in the complex and critical context of jungle environments, and it explores potential future research directions in UAV jungle object tracking.","PeriodicalId":13116,"journal":{"name":"IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing","volume":null,"pages":null},"PeriodicalIF":4.7000,"publicationDate":"2024-09-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=10678881","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing","FirstCategoryId":"5","ListUrlMain":"https://ieeexplore.ieee.org/document/10678881/","RegionNum":2,"RegionCategory":"地球科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"ENGINEERING, ELECTRICAL & ELECTRONIC","Score":null,"Total":0}
引用次数: 0
Abstract
In recent years, UAV object tracking has provided technical support across various fields. Most existing work relies on convolutional neural networks (CNNs) or visual transformers. However, CNNs have limited receptive fields, resulting in suboptimal performance, while transformers require substantial computational resources, making training and inference challenging. Mountainous and jungle environments-critical components of the Earth's surface and key scenarios for UAV object tracking-present unique challenges due to steep terrain, dense vegetation, and rapidly changing weather conditions, which complicate UAV tracking. The lack of relevant datasets further reduces tracking accuracy. This article introduces a new tracking framework based on a state-space model called TrackingMamba, which uses a single-stream tracking architecture with Vision Mamba as its backbone. TrackingMamba not only matches transformer-based trackers in global feature extraction and long-range dependence modeling but also maintains computational efficiency with linear growth. Compared to other advanced trackers, TrackingMamba delivers higher accuracy with a simpler model framework, fewer parameters, and reduced FLOPs. Specifically, on the UAV123 benchmark, TrackingMamba outperforms the baseline model OSTtrack-256, improving AUC by 2.59% and Precision by 4.42%, while reducing parameters by 95.52% and FLOPs by 95.02%. The article also evaluates the performance and shortcomings of TrackingMamba and other advanced trackers in the complex and critical context of jungle environments, and it explores potential future research directions in UAV jungle object tracking.
期刊介绍:
The IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing addresses the growing field of applications in Earth observations and remote sensing, and also provides a venue for the rapidly expanding special issues that are being sponsored by the IEEE Geosciences and Remote Sensing Society. The journal draws upon the experience of the highly successful “IEEE Transactions on Geoscience and Remote Sensing” and provide a complementary medium for the wide range of topics in applied earth observations. The ‘Applications’ areas encompasses the societal benefit areas of the Global Earth Observations Systems of Systems (GEOSS) program. Through deliberations over two years, ministers from 50 countries agreed to identify nine areas where Earth observation could positively impact the quality of life and health of their respective countries. Some of these are areas not traditionally addressed in the IEEE context. These include biodiversity, health and climate. Yet it is the skill sets of IEEE members, in areas such as observations, communications, computers, signal processing, standards and ocean engineering, that form the technical underpinnings of GEOSS. Thus, the Journal attracts a broad range of interests that serves both present members in new ways and expands the IEEE visibility into new areas.