Meta-RawResampler: Raw image rescaling based on pattern guidance

IF 3.1 | Zone 4, Computer Science | Q2, COMPUTER SCIENCE, INFORMATION SYSTEMS

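Journal of Visual Communication and Image Representation, Volume 111, Article 104514 | Pub Date: 2025-06-18 | DOI: 10.1016/j.jvcir.2025.104514 | URL: https://www.sciencedirect.com/science/article/pii/S1047320325001282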

Jingyun Liu, Han Zhu, Daiqin Yang, Zhenzhong Chen, Shan Liu

Citations: 0

Abstract

Modern digital cameras allow users to set the resolution at which images are saved. When low-resolution (LR) images are required for storage, images are downscaled; if high-resolution (HR) images are needed later, upscaling is performed. Image rescaling jointly optimizes downscaling and upscaling to achieve both visually plausible LR images and high-fidelity HR images. However, previous works have primarily focused on rescaling from RGB images. They cannot alleviate the errors produced during image signal processing (ISP), particularly those arising from demosaicking and denoising. Such errors may propagate through the downscaling process and ultimately degrade the upscaling results. In contrast, we produce LR images directly from noisy raw images. This poses additional challenges: the incomplete color information and noise in raw images hinder the encoding of texture details from HR images into their LR counterparts. To address this issue, we propose Meta-RawResampler, which performs downscaling with spatial-color-wise resampling kernels. The kernel weights are generated under the guidance of both pattern information and image content to facilitate interaction between color channels. This interaction helps the model infer missing colors from the recorded ones, thereby enhancing the network’s ability to understand and preserve high-frequency information. Moreover, we propose a Pattern-Content Dynamic Guidance Module (PCDG), which is decomposed into a Channel-wise Per-pixel Color Interpolation Block and a Color-wise Feature Interpolation Block. The former uses pattern information and image content to generate channel-wise, spatially adaptive kernel weights, exploring correlations across the color, channel, and spatial dimensions to facilitate adaptive color interaction. The latter employs color-wise convolution to further enhance the model’s ability to learn spatial information. With these designs, our resampler achieves upscaling results with higher fidelity. Extensive experiments validate the superiority of the proposed Meta-RawResampler both quantitatively and qualitatively.
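To make the role of pattern information concrete, here is a minimal sketch of pattern-aware downscaling of a raw mosaic. It is not the authors' method: it assumes an RGGB Bayer layout and uses fixed averaging weights where the abstract describes learned, content-dependent kernels. The names `bayer_mask` and `pattern_aware_downscale` are hypothetical and chosen only for illustration.

```python
import numpy as np


def bayer_mask(height, width):
    """One-hot mask of shape (3, H, W) marking which color each raw pixel records.

    Assumes an RGGB layout (an assumption of this sketch, not stated in the abstract).
    """
    mask = np.zeros((3, height, width), dtype=np.float64)
    mask[0, 0::2, 0::2] = 1.0  # red samples
    mask[1, 0::2, 1::2] = 1.0  # green samples on red rows
    mask[1, 1::2, 0::2] = 1.0  # green samples on blue rows
    mask[2, 1::2, 1::2] = 1.0  # blue samples
    return mask


def pattern_aware_downscale(raw, block=2):
    """Turn a single-channel raw mosaic (H, W) into an RGB image (3, H/block, W/block).

    Each output color is the mean of the raw samples of that color inside the
    block. The mask plays the role of the pattern information: without it, a
    resampling kernel cannot tell which color a given raw sample belongs to.
    """
    height, width = raw.shape
    if block % 2 or height % block or width % block:
        raise ValueError("block must be even and divide both image dimensions")

    mask = bayer_mask(height, width)
    samples = mask * raw  # (3, H, W): each color plane keeps only its own samples

    def block_sum(planes):
        # Sum every non-overlapping block x block window of each plane.
        tiled = planes.reshape(3, height // block, block, width // block, block)
        return tiled.sum(axis=(2, 4))

    # Normalize by the number of recorded samples per color in each block.
    return block_sum(samples) / block_sum(mask)


# Usage: mosaic a flat-colored scene, then recover the color at half resolution.
colors = np.array([0.2, 0.5, 0.8])
raw = (bayer_mask(4, 4) * colors[:, None, None]).sum(axis=0)
low_res = pattern_aware_downscale(raw, block=2)
```

In this toy version the weights depend only on the pattern. The approach described in the abstract additionally conditions the kernel weights on image content, which a fixed average cannot do.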

Source journal

Journal of Visual Communication and Image Representation (Engineering & Technology, Computer Science: Software Engineering)

CiteScore: 5.40

Self-citation rate: 11.50%

Articles published: 188

Review time: 9.9 months

Journal description: The Journal of Visual Communication and Image Representation publishes papers on state-of-the-art visual communication and image representation, with emphasis on novel technologies and theoretical work in this multidisciplinary area of pure and applied research. The field of visual communication and image representation is considered in its broadest sense and covers both digital and analog aspects as well as processing and communication in biological visual systems.