A Benchmark for UAV-View Natural Language-Guided Tracking
Hengyou Li, Xinyan Liu, Guorong Li
Electronics, published 2024-04-28
DOI: 10.3390/electronics13091706 (https://doi.org/10.3390/electronics13091706)
Impact factor: 2.6; JCR: Q2 (Computer Science, Information Systems)
Citations: 0
Abstract
We propose a new benchmark, UAVNLT (Unmanned Aerial Vehicle Natural Language Tracking), for the UAV-view natural language-guided tracking task. UAVNLT consists of videos of vehicles on city roads, captured by UAV cameras in four cities. For each video, the vehicles’ bounding boxes, trajectories, and natural language descriptions are carefully annotated. Existing data sets are annotated only with bounding boxes. The natural language sentences in our data set make it better suited to applications in which humans take part in the system, because language is not only friendlier for human–computer interaction but can also compensate for the low uniqueness of appearance features in tracking. We tested several existing methods on our new benchmark and found their performance unsatisfactory. To pave the way for future work, we propose a baseline method suited to this task, which achieves state-of-the-art performance. We believe our new data set and proposed baseline method will be helpful in many fields, such as smart cities, smart transportation, and vehicle management.
Electronics · Computer Science: Computer Networks and Communications
CiteScore
1.10
Self-citation rate
10.30%
Articles published
3515
Review time
16.71 days
About the journal:
Electronics (ISSN 2079-9292; CODEN: ELECGJ) is an international, open access journal on the science of electronics and its applications, published semimonthly online by MDPI.