Learning in Compressed Domain for Faster Machine Vision Tasks

2021 International Conference on Visual Communications and Image Processing (VCIP) Pub Date : 2021-12-05 DOI:10.1109/VCIP53242.2021.9675369

Jinming Liu, Heming Sun, J. Katto

Citations: 7

Abstract

Learned image compression (LIC) has demonstrated strong performance both on tasks driven by reconstruction quality (measured by, e.g., PSNR and MS-SSIM) and on machine vision tasks such as image understanding. However, most LIC frameworks operate in the pixel domain, which requires decoding the image before the task can run. In this paper, we develop a learned compressed-domain framework for machine vision tasks. 1) By sending the compressed latent representation directly to the task network, the decoding computation is eliminated, which reduces complexity. 2) By sorting the latent channels by entropy, only selected channels are transmitted to the task network, which reduces the bitrate. As a result, compared with traditional pixel-domain methods, we reduce multiply-accumulate operations (MACs) by about 1/3 and inference time by about 1/5 while keeping the same accuracy. Moreover, the proposed channel selection yields a bitrate saving of up to 6.8%.
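
The entropy-based channel selection mentioned in point 2) can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not say how entropy is estimated or which end of the ranking is kept. The sketch assumes an empirical Shannon entropy computed over each quantized channel (a learned entropy model would normally supply this estimate) and assumes the highest-entropy channels are the ones retained. The names `channel_entropy_bits` and `select_channels` and the toy latent are hypothetical.

```python
import math
from collections import Counter


def channel_entropy_bits(channel):
    """Empirical Shannon entropy (bits per symbol) of one quantized channel."""
    counts = Counter(channel)
    n = len(channel)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())


def select_channels(latent, keep):
    """Rank channels by entropy and return the indices of the `keep`
    highest-entropy ones (assumption: these carry the most information),
    together with the per-channel entropies."""
    entropies = [channel_entropy_bits(ch) for ch in latent]
    ranked = sorted(range(len(latent)), key=lambda i: entropies[i], reverse=True)
    return sorted(ranked[:keep]), entropies


# Toy latent: 4 channels, each flattened to 8 quantized symbols.
latent = [
    [0, 0, 0, 0, 0, 0, 0, 0],  # constant channel
    [0, 1, 0, 1, 0, 1, 0, 1],  # two equally likely symbols
    [0, 1, 2, 3, 0, 1, 2, 3],  # four equally likely symbols
    [0, 0, 0, 0, 0, 0, 0, 1],  # mostly zeros
]

kept, entropies = select_channels(latent, keep=2)
print("kept channel indices:", kept)
print("entropies (bits/symbol):", [round(e, 3) for e in entropies])
```

Only the channels listed in `kept` would be entropy-coded and passed to the task network in this sketch; how many channels to keep is a trade-off between bitrate and task accuracy that the abstract does not quantify beyond the reported saving of up to 6.8%.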



