Object Tracking and Classification in Videos Using Compressive Measurements

Proceedings of the 3rd International Conference on Vision, Image and Signal Processing, Pub Date: 2019-08-26, DOI: 10.1145/3387168.3387188

C. Kwan


Citations: 1

Abstract



In this paper, we summarize some recent results on object tracking and classification in infrared and low-quality videos using compressive measurements. Two compressive measurement modes were investigated: one based on subsampling of the original measurements, and the other based on a coded aperture camera. It is important to emphasize that conventional trackers require the compressive measurements to be reconstructed before any tracking or classification step can begin. This reconstruction is time-consuming and may also lose information. Our proposed approach instead operates directly on the compressive measurements. A deep learning tracker known as You Only Look Once (YOLO), which is fast and can track multiple objects simultaneously, was used to track objects. The detected objects are then recognized using another deep learning model, the residual network (ResNet). Extensive experiments were conducted using infrared videos captured from long distances. Results show that the proposed approach performs much better than conventional trackers, which failed to deal with compressive measurements. In addition, the ResNet classifier performs better than the built-in classifier in YOLO.
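To make the two measurement modes concrete, the following is a minimal NumPy sketch of how such measurements could be simulated. It is an illustration under stated assumptions, not the paper's implementation: the abstract does not give the sampling pattern or the camera model, so the random pixel mask, the per-frame binary codes, the function names `subsample` and `coded_exposure`, and the toy clip size are all hypothetical.

```python
import numpy as np


def subsample(frame, keep_ratio, rng):
    """Subsampling mode (assumed model): keep a random fraction of pixels.

    Dropped pixels are set to zero; the boolean mask is returned so the
    caller knows which pixels were actually measured.
    """
    mask = rng.random(frame.shape) < keep_ratio
    return frame * mask, mask


def coded_exposure(frames, rng):
    """Coded aperture mode (assumed model): one measurement from T frames.

    Each frame is multiplied by its own random binary mask and the masked
    frames are summed, so T frames collapse into a single coded image.
    """
    masks = rng.integers(0, 2, size=frames.shape)
    return (frames * masks).sum(axis=0), masks


rng = np.random.default_rng(0)
video = rng.random((4, 8, 8))  # hypothetical clip: 4 frames of 8x8 pixels

sub, sub_mask = subsample(video[0], keep_ratio=0.5, rng=rng)
coded, code_masks = coded_exposure(video, rng)

print("measured pixels in subsampled frame:", int(sub_mask.sum()), "of", sub_mask.size)
print("coded measurement shape:", coded.shape)
```

In the approach summarized above, arrays like `sub` or `coded` would be passed to the detector directly, without first reconstructing the original frames.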
