An Efficient Frequency Domain Vision Pipeline From RAW Images to Backend Tasks

2023 IEEE International Symposium on Circuits and Systems (ISCAS) | Pub Date: 2023-05-21 | DOI: 10.1109/ISCAS46773.2023.10182018

Hao Li, Weiti Zhou, Xiangyu Zhang, Xin Lou


Citations: 0

Abstract

Although high-resolution images benefit computer vision performance, they are not commonly used in convolutional neural network (CNN)-based vision algorithms because of limited memory and computational resources. Learning in the frequency domain allows CNNs to accept high-resolution images directly, but the computation, time, and energy overhead of pre-processing, including image signal processing (ISP) and domain transformation, can be large. This paper explores different image processing and domain transformation operations and proposes an efficient end-to-end frequency-domain learning pipeline from RAW images to vision tasks. In particular, we simplify the pre-processing stage by skipping the entire ISP pipeline and replacing the Discrete Cosine Transform (DCT) with a multiplication-free approximation. Experimental results show that the final vision performance of the proposed pipeline is very close to that of the conventional pipeline, while a significant number of redundant operations are eliminated.
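To make the idea of a multiplication-free approximated DCT concrete, the sketch below builds one possible 8-point approximation and applies it to an 8x8 block using only additions and subtractions. The abstract does not say which approximation the paper uses, so the choice here (rounding the scaled cosines of the 8-point DCT-II to the nearest integer, which leaves only the entries -1, 0, and 1) is an assumption made for illustration. The function names `rounded_dct_matrix`, `transform_1d`, and `transform_2d` are hypothetical, and the per-row scaling that would make the transform orthonormal is left out.

```python
import math

N = 8  # block size; the rounding argument below is specific to N = 8


def rounded_dct_matrix():
    """Build an 8x8 matrix with entries in {-1, 0, 1} by rounding DCT-II cosines.

    Assumption for illustration only: this is one known way to obtain a
    multiplication-free transform, not necessarily the one used in the paper.
    """
    # DC row: the scaled DCT coefficient is about 0.71 for N = 8, which rounds to 1.
    rows = [[1] * N]
    for k in range(1, N):
        rows.append([round(math.cos((2 * j + 1) * k * math.pi / (2 * N)))
                     for j in range(N)])
    return rows


T = rounded_dct_matrix()


def transform_1d(x):
    """Apply T to a length-8 sequence using only additions and subtractions."""
    out = []
    for row in T:
        plus = sum(v for v, t in zip(x, row) if t == 1)
        minus = sum(v for v, t in zip(x, row) if t == -1)
        out.append(plus - minus)
    return out


def transform_2d(block):
    """Separable 2-D transform: rows first, then columns (equals T * block * T^T)."""
    rows_done = [transform_1d(r) for r in block]
    cols_done = [transform_1d(c) for c in zip(*rows_done)]
    return [list(r) for r in zip(*cols_done)]  # transpose back to row-major


if __name__ == "__main__":
    flat = [[1] * N for _ in range(N)]   # constant block: all energy goes to DC
    coeffs = transform_2d(flat)
    print(coeffs[0][0], coeffs[0][1], coeffs[1][0])
```

Because every matrix entry is -1, 0, or 1, each output coefficient is a signed sum of inputs, which is the property that would let a hardware implementation avoid multipliers. The rows of this particular matrix are mutually orthogonal but not unit-norm, so a real pipeline would either fold the missing scale factors into a later stage or let the network absorb them; which option the paper takes is not stated in the abstract.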
