An Efficient Frequency Domain Vision Pipeline From RAW Images to Backend Tasks

2023 IEEE International Symposium on Circuits and Systems (ISCAS) | Pub Date: 2023-05-21 | DOI: 10.1109/ISCAS46773.2023.10182018

Hao Li, Weiti Zhou, Xiangyu Zhang, Xin Lou

Citations: 0

Abstract

Though high-resolution images benefit computer vision performance, they are not commonly used in convolutional neural network (CNN)-based vision algorithms because of limited memory and computation resources. Learning in the frequency domain allows CNNs to accept high-resolution images directly, but the computation, time, and energy overhead of pre-processing, including image signal processing (ISP) and domain transformation, can be large. This paper explores different image processing and domain transformation operations and proposes an efficient end-to-end frequency domain learning pipeline from RAW images to vision tasks. In particular, we simplify the pre-processing stage by skipping the entire ISP pipeline and replacing the Discrete Cosine Transform (DCT) with a multiplication-free approximation. Experimental results show that the final vision performance of the proposed pipeline is very close to that of the conventional pipeline, while a significant number of redundant operations are eliminated.
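To make the idea of a multiplication-free approximated DCT concrete, the following is a minimal sketch. The abstract does not state which approximation the authors use; this example substitutes the signed DCT (the element-wise sign of the DCT-II matrix), a well-known approximation whose entries are all +1 or -1. The function names (`dct_matrix`, `signed_dct_matrix`, `blockwise_transform`), the 8x8 block size, and the treatment of the RAW frame as a single-channel integer array are all assumptions made for illustration.

```python
import numpy as np


def dct_matrix(n=8):
    """Return the orthonormal n x n DCT-II matrix (floating point)."""
    k = np.arange(n).reshape(-1, 1)   # frequency index (rows)
    i = np.arange(n).reshape(1, -1)   # sample index (columns)
    c = np.cos(np.pi * (2 * i + 1) * k / (2 * n))
    c[0, :] *= np.sqrt(1.0 / n)
    c[1:, :] *= np.sqrt(2.0 / n)
    return c


def signed_dct_matrix(n=8):
    """Signed DCT (assumed stand-in, not necessarily the paper's transform).

    Every entry is +1 or -1, so a hardware implementation needs only
    additions and subtractions.
    """
    return np.sign(dct_matrix(n)).astype(np.int64)


def blockwise_transform(image, t):
    """Apply the 2D transform T X T^T to each non-overlapping block.

    `image` is a 2D integer array (e.g. a single-channel RAW mosaic).
    Rows/columns that do not fill a whole block are cropped.
    NumPy's matmul is used here for brevity; with +/-1 entries the same
    result can be computed with adds and subtracts only.
    """
    n = t.shape[0]
    h, w = image.shape
    h, w = h - h % n, w - w % n
    blocks = (
        image[:h, :w]
        .astype(np.int64)
        .reshape(h // n, n, w // n, n)
        .transpose(0, 2, 1, 3)        # shape: (h/n, w/n, n, n)
    )
    return t @ blocks @ t.T


if __name__ == "__main__":
    t = signed_dct_matrix(8)
    raw = np.arange(16 * 24).reshape(16, 24)   # synthetic stand-in for RAW data
    coeffs = blockwise_transform(raw, t)
    print(coeffs.shape)
```

Because the first row of the signed matrix is all ones, the DC coefficient of each block is simply the sum of its pixels. The approximation is not orthonormal, so a frequency domain network trained on its output would see differently scaled coefficients than with an exact DCT.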
