Linearly Transformed Color Guide for Low-Bitrate Diffusion-Based Image Compression

Tom Bordin;Thomas Maugey
IEEE Transactions on Image Processing: a publication of the IEEE Signal Processing Society, vol. 34, pp. 468-482. Journal article, published 2024-12-30.
DOI: 10.1109/TIP.2024.3521301
https://ieeexplore.ieee.org/document/10818510/

Abstract

This study addresses the challenge of controlling the global color of images generated by a diffusion model, without training or fine-tuning. We rewrite the guidance equations, yielding new ones that keep the outputs closer to a known color map without compromising generation quality. For color guidance, we show that the guidance scale should increase, rather than decrease, throughout the diffusion process. In a second contribution, we apply our guidance in a compression framework, where we combine semantic and general color information about the image to decode, at very low cost. We show that our method improves the fidelity and realism of compressed images at extremely low bit rates ($10^{-2}$ bpp), outperforming both classical and more semantically oriented approaches on these criteria. The implementation of our method is available on GitLab at https://gitlab.inria.fr/tbordin/color-guidance.
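To make the idea of guiding toward a known color map with an increasing scale more concrete, here is a minimal toy sketch in Python. It does not reproduce the paper's guidance equations or its linear transformation. It assumes a generic least-squares guidance toward a coarse, average-pooled color map and a hypothetical linear schedule, and there is no diffusion model in the loop. All function names and parameters (`downsample`, `upsample`, `guidance_scale`, `color_guidance_step`, `s_min`, `s_max`, `factor`) are illustrative assumptions.

```python
import numpy as np


def downsample(img, factor):
    """Average-pool an (H, W, 3) image by an integer factor (coarse color map)."""
    h, w, c = img.shape
    return img.reshape(h // factor, factor, w // factor, factor, c).mean(axis=(1, 3))


def upsample(cmap, factor):
    """Nearest-neighbour upsampling of a coarse map back to image resolution."""
    return np.repeat(np.repeat(cmap, factor, axis=0), factor, axis=1)


def guidance_scale(step, num_steps, s_min=0.1, s_max=1.0):
    """Hypothetical linear schedule: weak guidance early, strong guidance late."""
    return s_min + (s_max - s_min) * step / (num_steps - 1)


def color_guidance_step(x, color_map, factor, scale):
    """One correction step along the negative gradient of
    0.5 * ||downsample(x) - color_map||^2 (constant factor folded into scale)."""
    residual = color_map - downsample(x, factor)
    return x + scale * upsample(residual, factor)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    factor, num_steps = 4, 10
    x = rng.random((8, 8, 3))                              # stand-in for the current image estimate
    color_map = downsample(rng.random((8, 8, 3)), factor)  # stand-in for the known color map

    for step in range(num_steps):
        # A real sampler would run its denoising update here; this toy omits it.
        scale = guidance_scale(step, num_steps)
        x = color_guidance_step(x, color_map, factor, scale)
        err = np.abs(downsample(x, factor) - color_map).max()
        print(f"step {step}: scale={scale:.2f}, max color error={err:.4f}")
```

In this toy setting each step shrinks the coarse color error by a factor of `1 - scale`, so the early, weakly guided steps leave room for the generator to shape content, and the later, strongly guided steps pin down the global colors. Whether a linear schedule is the right choice is an assumption of this sketch, not a claim from the abstract.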