Linearly Transformed Color Guide for Low-Bitrate Diffusion-Based Image Compression
Authors: Tom Bordin; Thomas Maugey
Journal: IEEE Transactions on Image Processing: a publication of the IEEE Signal Processing Society, vol. 34, pp. 468-482
Publication date: 2024-12-30
Publication type: Journal Article
DOI: 10.1109/TIP.2024.3521301
URL: https://ieeexplore.ieee.org/document/10818510/
Citations: 0
Abstract
This study addresses the challenge of controlling the global color aspect of images generated by a diffusion model without training or fine-tuning. We rewrite the guidance equations, yielding new ones that keep the outputs closer to a known color map without compromising generation quality. In the context of color guidance, we show that the guidance scale should increase, rather than decrease, throughout the diffusion process. As a second contribution, we apply our guidance in a compression framework, where we combine semantic and general color information about the image to be decoded, at very low cost. We show that our method improves the fidelity and realism of compressed images at extremely low bit rates ($10^{-2}$ bpp), performing better on these criteria than classical or more semantically oriented approaches. The implementation of our method is available on GitLab at https://gitlab.inria.fr/tbordin/color-guidance.
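To make two claims from the abstract concrete, the following is a minimal toy sketch in Python with NumPy. It is not the authors' guidance equations or their released code. Every name in it (`bit_budget`, `block_mean`, `guidance_scale`, `color_guidance_step`, `demo`) is hypothetical, the linear schedule and the block-average color map are assumptions chosen for illustration, and the diffusion denoiser itself is omitted. It shows how small the bit budget is at $10^{-2}$ bpp, and what it can look like for a color correction to grow stronger toward the end of the process.

```python
import numpy as np


def bit_budget(height, width, bpp=1e-2):
    """Total number of bits available for an image coded at `bpp` bits per pixel."""
    return height * width * bpp


def block_mean(img, k):
    """Average an (H, W, 3) image over k x k blocks to get a coarse color map.

    Assumes H and W are multiples of k.
    """
    h, w, c = img.shape
    return img.reshape(h // k, k, w // k, k, c).mean(axis=(1, 3))


def guidance_scale(step, num_steps, s_min=0.1, s_max=1.0):
    """Hypothetical linear schedule: the scale grows from s_min to s_max."""
    return s_min + (s_max - s_min) * step / (num_steps - 1)


def color_guidance_step(x, color_map, k, scale):
    """Push x toward images whose k x k block means match `color_map`.

    The block-mean residual is upsampled by repetition and added to x,
    which shrinks the color error by a factor of (1 - scale).
    """
    residual = color_map - block_mean(x, k)
    upsampled = np.repeat(np.repeat(residual, k, axis=0), k, axis=1)
    return x + scale * upsampled


def demo(num_steps=10, size=32, k=8, seed=0):
    """Run the toy loop; a real sampler would also denoise x at every step."""
    rng = np.random.default_rng(seed)
    target_map = rng.random((size // k, size // k, 3))  # stand-in color map
    x = rng.standard_normal((size, size, 3))            # stand-in noisy sample
    errors = []
    for step in range(num_steps):
        x = color_guidance_step(x, target_map, k, guidance_scale(step, num_steps))
        errors.append(float(np.abs(block_mean(x, k) - target_map).max()))
    return x, target_map, errors


if __name__ == "__main__":
    # A 512 x 512 image at 0.01 bpp leaves about 2621 bits, roughly 328 bytes.
    print(bit_budget(512, 512), bit_budget(512, 512) / 8)
    _, _, errors = demo()
    print(errors[0], errors[-1])
```

In this toy setting the scale reaches 1 at the last step, so the block means of the final sample match the color map, while early steps only nudge the sample. The method described in the abstract instead derives its own guidance equations inside the diffusion sampler, so the sketch should be read only as an intuition aid.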