{"title":"利用光流引导流感知增强实时目标检测","authors":"Tongbo Wang;Lin Zhu;Hua Huang","doi":"10.1109/TCSVT.2025.3525796","DOIUrl":null,"url":null,"abstract":"Real-time object detection in Unmanned Aerial Vehicle (UAV) videos remains a significant challenge due to the fast motion and small scale of objects. Existing streaming perception models struggle to accurately capture fine-grained motion cues between consecutive frames, leading to suboptimal performance in dynamic UAV scenarios. To address these challenges, StreamFlow is proposed to integrate optical flow information and enhance real-time object detection in UAV videos. StreamFlow incorporates Flow-Guided Dynamic Prediction (FGDP) to refine position predictions using local optical flow information and Optical Flow Guided Optimization (OFGO) to optimize model parameters considering both localization loss and optical flow reliability. Central to OFGO is the Adaptive Flow Weighting (AFW) module, which focuses on reliable flow samples during training. The proposed integration of optical flow and adaptive weighting scheme significantly enhances the ability of streaming perception models to handle fast-moving objects in dynamic UAV environments. Extensive experiments on four challenging UAV video datasets demonstrate the superior performance of StreamFlow compared to state-of-the-art methods in terms of accuracy.","PeriodicalId":13082,"journal":{"name":"IEEE Transactions on Circuits and Systems for Video Technology","volume":"35 5","pages":"4816-4830"},"PeriodicalIF":8.3000,"publicationDate":"2025-01-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"Enhancing Real-Time Object Detection With Optical Flow-Guided Streaming Perception\",\"authors\":\"Tongbo Wang;Lin Zhu;Hua Huang\",\"doi\":\"10.1109/TCSVT.2025.3525796\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Real-time object detection in Unmanned Aerial Vehicle (UAV) videos remains a significant challenge due to the fast motion and small scale of objects. Existing streaming perception models struggle to accurately capture fine-grained motion cues between consecutive frames, leading to suboptimal performance in dynamic UAV scenarios. To address these challenges, StreamFlow is proposed to integrate optical flow information and enhance real-time object detection in UAV videos. StreamFlow incorporates Flow-Guided Dynamic Prediction (FGDP) to refine position predictions using local optical flow information and Optical Flow Guided Optimization (OFGO) to optimize model parameters considering both localization loss and optical flow reliability. Central to OFGO is the Adaptive Flow Weighting (AFW) module, which focuses on reliable flow samples during training. The proposed integration of optical flow and adaptive weighting scheme significantly enhances the ability of streaming perception models to handle fast-moving objects in dynamic UAV environments. 
Extensive experiments on four challenging UAV video datasets demonstrate the superior performance of StreamFlow compared to state-of-the-art methods in terms of accuracy.\",\"PeriodicalId\":13082,\"journal\":{\"name\":\"IEEE Transactions on Circuits and Systems for Video Technology\",\"volume\":\"35 5\",\"pages\":\"4816-4830\"},\"PeriodicalIF\":8.3000,\"publicationDate\":\"2025-01-03\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"IEEE Transactions on Circuits and Systems for Video Technology\",\"FirstCategoryId\":\"5\",\"ListUrlMain\":\"https://ieeexplore.ieee.org/document/10824896/\",\"RegionNum\":1,\"RegionCategory\":\"工程技术\",\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"Q1\",\"JCRName\":\"ENGINEERING, ELECTRICAL & ELECTRONIC\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"IEEE Transactions on Circuits and Systems for Video Technology","FirstCategoryId":"5","ListUrlMain":"https://ieeexplore.ieee.org/document/10824896/","RegionNum":1,"RegionCategory":"工程技术","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"ENGINEERING, ELECTRICAL & ELECTRONIC","Score":null,"Total":0}
Enhancing Real-Time Object Detection With Optical Flow-Guided Streaming Perception
Real-time object detection in Unmanned Aerial Vehicle (UAV) videos remains a significant challenge due to the fast motion and small scale of objects. Existing streaming perception models struggle to accurately capture fine-grained motion cues between consecutive frames, leading to suboptimal performance in dynamic UAV scenarios. To address these challenges, StreamFlow is proposed to integrate optical flow information and enhance real-time object detection in UAV videos. StreamFlow incorporates Flow-Guided Dynamic Prediction (FGDP), which refines position predictions using local optical flow information, and Optical Flow Guided Optimization (OFGO), which optimizes model parameters by considering both localization loss and optical flow reliability. Central to OFGO is the Adaptive Flow Weighting (AFW) module, which focuses training on reliable flow samples. The proposed integration of optical flow and the adaptive weighting scheme significantly enhances the ability of streaming perception models to handle fast-moving objects in dynamic UAV environments. Extensive experiments on four challenging UAV video datasets demonstrate that StreamFlow outperforms state-of-the-art methods in terms of accuracy.
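To make the OFGO/AFW idea more concrete, the sketch below shows one plausible way a per-sample localization loss could be reweighted by an optical-flow reliability score. This is a minimal illustration, not the paper's implementation: the function names, the use of forward-backward flow consistency as the reliability cue, and the specific weighting form are all assumptions, since the abstract does not give the exact formulation.

```python
import torch

def flow_reliability(flow_fwd, flow_bwd_warped, alpha=0.5):
    """Hypothetical per-pixel reliability weight in [0, 1].

    Assumes forward-backward consistency as the cue: if the backward flow
    warped into the reference frame cancels the forward flow, the flow is
    considered reliable. flow_fwd, flow_bwd_warped: (N, 2, H, W) tensors.
    """
    # Large forward-backward inconsistency -> low reliability.
    err = torch.norm(flow_fwd + flow_bwd_warped, dim=1)  # (N, H, W)
    return torch.exp(-alpha * err)

def adaptive_flow_weighted_loss(loc_loss_per_sample, reliability, lam=1.0):
    """Combine per-sample localization losses with adaptive flow weights,
    emphasizing samples whose optical flow is deemed reliable."""
    # Keep a uniform floor so unreliable samples are down-weighted,
    # not discarded entirely (an illustrative design choice).
    w = 0.5 + 0.5 * reliability
    return lam * (w * loc_loss_per_sample).mean()

# Usage sketch with random tensors standing in for real flows and losses.
if __name__ == "__main__":
    fwd = torch.randn(2, 2, 64, 64)
    bwd_warped = -fwd + 0.1 * torch.randn(2, 2, 64, 64)
    rel = flow_reliability(fwd, bwd_warped)
    loc = torch.rand(2, 64, 64)  # per-location localization loss
    print(adaptive_flow_weighted_loss(loc, rel).item())
```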
Journal Introduction:
The IEEE Transactions on Circuits and Systems for Video Technology (TCSVT) is dedicated to covering all aspects of video technologies from a circuits and systems perspective. We encourage submissions of general, theoretical, and application-oriented papers related to image and video acquisition, representation, presentation, and display. Additionally, we welcome contributions in areas such as processing, filtering, and transforms; analysis and synthesis; learning and understanding; compression, transmission, communication, and networking; as well as storage, retrieval, indexing, and search. Furthermore, papers focusing on hardware and software design and implementation are highly valued. Join us in advancing the field of video technology through innovative research and insights.