{"title":"Spatiotemporal Object Detection for Improved Aerial Vehicle Detection in Traffic Monitoring","authors":"Kristina Telegraph;Christos Kyrkou","doi":"10.1109/TAI.2024.3454566","DOIUrl":null,"url":null,"abstract":"This work presents advancements in multiclass vehicle detection using unmanned aerial vehicle (UAV) cameras through the development of spatiotemporal object detection models. The study introduces a spatiotemporal vehicle detection dataset (STVD) containing \n<inline-formula><tex-math>$6600$</tex-math></inline-formula>\n annotated sequential frame images captured by UAVs, enabling comprehensive training and evaluation of algorithms for holistic spatiotemporal perception. A YOLO-based object detection algorithm is enhanced to incorporate temporal dynamics, resulting in improved performance over single frame models. The integration of attention mechanisms into spatiotemporal models is shown to further enhance performance. Experimental validation demonstrates significant progress, with the best spatiotemporal model exhibiting a 16.22% improvement over single frame models, while it is demonstrated that attention mechanisms hold the potential for additional performance gains.","PeriodicalId":73305,"journal":{"name":"IEEE transactions on artificial intelligence","volume":"5 12","pages":"6159-6171"},"PeriodicalIF":0.0000,"publicationDate":"2024-09-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"IEEE transactions on artificial intelligence","FirstCategoryId":"1085","ListUrlMain":"https://ieeexplore.ieee.org/document/10666729/","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 0
Abstract
This work presents advancements in multiclass vehicle detection using unmanned aerial vehicle (UAV) cameras through the development of spatiotemporal object detection models. The study introduces a spatiotemporal vehicle detection dataset (STVD) containing
$6600$
annotated sequential frame images captured by UAVs, enabling comprehensive training and evaluation of algorithms for holistic spatiotemporal perception. A YOLO-based object detection algorithm is enhanced to incorporate temporal dynamics, resulting in improved performance over single frame models. The integration of attention mechanisms into spatiotemporal models is shown to further enhance performance. Experimental validation demonstrates significant progress, with the best spatiotemporal model exhibiting a 16.22% improvement over single frame models, while it is demonstrated that attention mechanisms hold the potential for additional performance gains.