Authors: Xingyu Chen, Zhengxing Wu, Junzhi Yu, Li Wen
Published in: Visual Perception and Control of Underwater Robots
Publication date: 2019-12-22
DOI: 10.1201/9781003144281-5 (https://doi.org/10.1201/9781003144281-5)
Rethinking Temporal Object Detection from Robotic Perspectives
Video object detection (VID) has been studied intensively for years, but almost all of the literature adopts a static, accuracy-based evaluation, i.e., average precision (AP). From a robotic perspective, recall continuity and localization stability are as important as accuracy, yet AP cannot reflect a detector's performance across time. In this paper, we propose non-reference assessments of continuity and stability based on object tracklets; these temporal evaluations can supplement static AP. Further, we develop an online tracklet refinement that improves detectors' temporal performance through short tracklet suppression, fragment filling, and temporal location fusion.
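The abstract names the three refinement steps but does not define them, so the following is only a minimal sketch of one plausible reading, not the authors' implementation. It assumes a single tracklet stored as one optional box per frame, fills short gaps by linear interpolation, drops runs that remain too short, and smooths locations with an exponential moving average. It also processes a whole tracklet at once rather than online. The function names, the parameters `min_len`, `max_gap`, and `alpha`, and the choice to fill before suppressing are all assumptions made for illustration.

```python
from typing import List, Optional, Tuple

Box = Tuple[float, ...]          # (x1, y1, x2, y2)
Track = List[Optional[Box]]      # one entry per frame; None means a missed detection


def fill_fragments(track: Track, max_gap: int = 2) -> Track:
    """Linearly interpolate boxes across gaps of at most max_gap frames."""
    out = list(track)
    last = None  # index of the most recent detected frame
    for i, box in enumerate(track):
        if box is None:
            continue
        if last is not None and 1 < i - last <= max_gap + 1:
            prev = track[last]
            for j in range(last + 1, i):
                t = (j - last) / (i - last)
                out[j] = tuple(p + t * (q - p) for p, q in zip(prev, box))
        last = i
    return out


def suppress_short(track: Track, min_len: int = 3) -> Track:
    """Remove runs of consecutive boxes that last fewer than min_len frames."""
    out = list(track)
    start = None  # first frame of the current run, if any
    for i in range(len(track) + 1):
        present = i < len(track) and track[i] is not None
        if present and start is None:
            start = i
        elif not present and start is not None:
            if i - start < min_len:
                for j in range(start, i):
                    out[j] = None
            start = None
    return out


def fuse_locations(track: Track, alpha: float = 0.5) -> Track:
    """Smooth box coordinates with an exponential moving average.

    The average restarts after every remaining gap.
    """
    out: Track = []
    prev = None
    for box in track:
        if box is None:
            prev = None
        else:
            if prev is not None:
                box = tuple(alpha * b + (1 - alpha) * p for b, p in zip(box, prev))
            prev = box
        out.append(box)
    return out


def refine(track: Track, min_len: int = 3, max_gap: int = 2, alpha: float = 0.5) -> Track:
    """Fill short gaps, drop short runs, then smooth the surviving boxes."""
    filled = fill_fragments(track, max_gap)
    return fuse_locations(suppress_short(filled, min_len), alpha)


# Hypothetical tracklet: a one-frame gap at index 2 and an isolated box at index 8.
track = [None, (0.0, 0.0, 10.0, 10.0), None, (4.0, 0.0, 14.0, 10.0),
         (6.0, 0.0, 16.0, 10.0), None, None, None, (50.0, 50.0, 60.0, 60.0), None]
refined = refine(track)
```

In this example the gap at frame 2 is filled and then smoothed, while the isolated detection at frame 8 is treated as a spurious short tracklet and removed. Filling before suppressing keeps a tracklet that is only long enough once its gaps are bridged; an online variant would instead have to delay output until a tracklet has survived `min_len` frames.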
In addition, we propose a small-overlap suppression that extends VID methods to the single object tracking (SOT) task, forming a flexible SOT-by-detection framework.
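The abstract does not spell out how small-overlap suppression works. One plausible reading, sketched below under that assumption, is that detections whose overlap (IoU) with the previous target box is too small are discarded, and the highest-scoring survivor becomes the new target. The names `small_overlap_suppression` and `track_by_detection`, the threshold `min_iou`, and the rule of holding the last box when nothing survives are hypothetical choices for illustration, not the authors' method.

```python
from typing import List, Optional, Tuple

Box = Tuple[float, float, float, float]   # (x1, y1, x2, y2)
Detection = Tuple[Box, float]             # (box, confidence score)


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def small_overlap_suppression(prev_box: Box, detections: List[Detection],
                              min_iou: float = 0.3) -> Optional[Detection]:
    """Discard detections that overlap the previous target too little.

    Returns the highest-scoring surviving detection, or None if all are suppressed.
    """
    kept = [d for d in detections if iou(prev_box, d[0]) >= min_iou]
    return max(kept, key=lambda d: d[1]) if kept else None


def track_by_detection(init_box: Box, frames: List[List[Detection]],
                       min_iou: float = 0.3) -> List[Box]:
    """Follow one target through per-frame detector outputs."""
    target = init_box
    path = []
    for detections in frames:
        best = small_overlap_suppression(target, detections, min_iou)
        if best is not None:
            target = best[0]
        # Assumption: when every detection is suppressed, hold the last known box.
        path.append(target)
    return path


# Hypothetical detector output for three frames, including high-scoring distractors.
frames = [
    [((1.0, 0.0, 11.0, 10.0), 0.6), ((100.0, 100.0, 110.0, 110.0), 0.9)],
    [((200.0, 200.0, 210.0, 210.0), 0.95)],
    [((2.0, 0.0, 12.0, 10.0), 0.7)],
]
path = track_by_detection((0.0, 0.0, 10.0, 10.0), frames)
```

Here the distractors in frames 1 and 2 score higher than the target but do not overlap its previous location, so they are suppressed and the track stays on the initial object.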
Extensive experiments are conducted on the ImageNet VID dataset and on real-world robotic tasks, where the superiority of the proposed approaches is validated. Code will be made publicly available.