Learn A Compression for Objection Detection - VAE with a Bridge

2021 International Conference on Visual Communications and Image Processing (VCIP)
Pub Date: 2021-12-05
DOI: 10.1109/VCIP53242.2021.9675387

Yixin Mei, Fan Li, Li Li, Zhu Li


Citations: 2

Abstract

Recent advances in sensor technology and the wide deployment of visual sensors have led to a new application in which images are compressed not mainly for pixel recovery for human consumption, but for communication to cloud-side machine vision tasks such as classification, identification, detection, and tracking. This opens up new research dimensions for learning-based compression that directly optimizes the loss function of the vision task, and therefore achieves better compression performance than recovering pixels first and then performing the vision task. In this work, we develop a learning-based compression scheme that learns a compact feature representation and appropriate bitstreams for the task of visual object detection. A Variational Auto-Encoder (VAE) framework is adopted for learning the compact representation, while a bridge network is trained to drive the detection loss function. Simulation results demonstrate that this approach achieves a new state of the art in task-driven compression efficiency compared with pixel recovery approaches, including both learning-based and handcrafted solutions.
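The abstract does not state the training objective. As a rough illustration only, task-driven learned compression is often trained on a rate term plus a weighted task loss, and the sketch below assumes that form. The function names, the weight `lam`, and the toy numbers are hypothetical and are not taken from the paper.

```python
import math


def estimated_bits(likelihoods):
    """Estimated bitstream length: sum of -log2(p) over the latent symbols.

    `likelihoods` stands in for the probabilities an entropy model would
    assign to the quantized latents (an assumption for this sketch).
    """
    return sum(-math.log2(p) for p in likelihoods)


def rate_task_loss(likelihoods, detection_loss, num_pixels, lam=1.0):
    """Bits per pixel plus a weighted detection loss.

    `lam` is a hypothetical trade-off weight, not a value from the paper.
    """
    bpp = estimated_bits(likelihoods) / num_pixels
    return bpp + lam * detection_loss


# Toy numbers: four latent symbols with likelihood 0.5 cost 1 bit each,
# so the rate is 4 bits over 4 pixels = 1.0 bpp.
loss = rate_task_loss([0.5, 0.5, 0.5, 0.5], detection_loss=0.25,
                      num_pixels=4, lam=2.0)
print(loss)  # 1.0 bpp + 2.0 * 0.25 = 1.5
```

In this form, raising `lam` favors detection accuracy at the cost of more bits, while lowering it favors a shorter bitstream.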

