Efficient Spatially Sparse Inference for Conditional GANs and Diffusion Models

IF 20.8 1区计算机科学 Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE

IEEE Transactions on Pattern Analysis and Machine Intelligence Pub Date : 2022-11-03 DOI:10.48550/arXiv.2211.02048

Muyang Li, Ji Lin, Chenlin Meng, Stefano Ermon, Song Han, Jun-Yan Zhu

{"title":"Efficient Spatially Sparse Inference for Conditional GANs and Diffusion Models","authors":"Muyang Li, Ji Lin, Chenlin Meng, Stefano Ermon, Song Han, Jun-Yan Zhu","doi":"10.48550/arXiv.2211.02048","DOIUrl":null,"url":null,"abstract":"During image editing, existing deep generative models tend to re-synthesize the entire output from scratch, including the unedited regions. This leads to a significant waste of computation, especially for minor editing operations. In this work, we present Spatially Sparse Inference (SSI), a general-purpose technique that selectively performs computation for edited regions and accelerates various generative models, including both conditional GANs and diffusion models. Our key observation is that users prone to gradually edit the input image. This motivates us to cache and reuse the feature maps of the original image. Given an edited image, we sparsely apply the convolutional filters to the edited regions while reusing the cached features for the unedited areas. Based on our algorithm, we further propose Sparse Incremental Generative Engine (SIGE) to convert the computation reduction to latency reduction on off-the-shelf hardware. With about 1%-area edits, SIGE accelerates DDPM by 3.0× on NVIDIA RTX 3090 and 4.6× on Apple M1 Pro GPU, Stable Diffusion by 7.2× on 3090, and GauGAN by 5.6× on 3090 and 5.2× on M1 Pro GPU. layers and apply it to Stable Diffusion. Additionally, we offer support for Apple M1 Pro GPU and include more results to substantiate the efficacy of our method.","PeriodicalId":13426,"journal":{"name":"IEEE Transactions on Pattern Analysis and Machine Intelligence","volume":" ","pages":""},"PeriodicalIF":20.8000,"publicationDate":"2022-11-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"14","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"IEEE Transactions on Pattern Analysis and Machine Intelligence","FirstCategoryId":"94","ListUrlMain":"https://doi.org/10.48550/arXiv.2211.02048","RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE","Score":null,"Total":0}

引用次数: 14

Abstract

During image editing, existing deep generative models tend to re-synthesize the entire output from scratch, including the unedited regions. This leads to a significant waste of computation, especially for minor editing operations. In this work, we present Spatially Sparse Inference (SSI), a general-purpose technique that selectively performs computation for edited regions and accelerates various generative models, including both conditional GANs and diffusion models. Our key observation is that users prone to gradually edit the input image. This motivates us to cache and reuse the feature maps of the original image. Given an edited image, we sparsely apply the convolutional filters to the edited regions while reusing the cached features for the unedited areas. Based on our algorithm, we further propose Sparse Incremental Generative Engine (SIGE) to convert the computation reduction to latency reduction on off-the-shelf hardware. With about 1%-area edits, SIGE accelerates DDPM by 3.0× on NVIDIA RTX 3090 and 4.6× on Apple M1 Pro GPU, Stable Diffusion by 7.2× on 3090, and GauGAN by 5.6× on 3090 and 5.2× on M1 Pro GPU. layers and apply it to Stable Diffusion. Additionally, we offer support for Apple M1 Pro GPU and include more results to substantiate the efficacy of our method.

查看原文本刊更多论文

条件GANs和扩散模型的有效空间稀疏推理

在图像编辑过程中，现有的深度生成模型倾向于从头开始重新合成整个输出，包括未编辑的区域。这导致了计算的巨大浪费，尤其是对于较小的编辑操作。在这项工作中，我们提出了空间稀疏推理（SSI），这是一种通用技术，可以选择性地对编辑区域执行计算，并加速各种生成模型，包括条件GANs和扩散模型。我们的主要观察结果是，用户倾向于逐渐编辑输入图像。这促使我们缓存和重用原始图像的特征图。给定一个编辑过的图像，我们稀疏地将卷积滤波器应用于编辑过的区域，同时为未编辑的区域重用缓存的特征。基于我们的算法，我们进一步提出了稀疏增量生成引擎（SIGE），以将现有硬件上的计算减少转换为延迟减少。通过约1%的面积编辑，SIGE在NVIDIA RTX 3090和Apple M1 Pro GPU上分别将DDPM加速3.0倍和4.6倍，在3090和GauGAN上分别将Stable Diffusion加速7.2倍和5.2倍。层并将其应用于稳定扩散。此外，我们提供对Apple M1 Pro GPU的支持，并提供更多结果来证实我们方法的有效性。

本文章由计算机程序翻译，如有差异，请以英文原文为准。

求助全文

约1分钟内获得全文求助全文

来源期刊

IEEE Transactions on Pattern Analysis and Machine Intelligence 工程技术-工程：电子与电气

CiteScore

28.40

自引率

3.00%

发文量

885

审稿时长

8.5 months

期刊介绍： The IEEE Transactions on Pattern Analysis and Machine Intelligence publishes articles on all traditional areas of computer vision and image understanding, all traditional areas of pattern analysis and recognition, and selected areas of machine intelligence, with a particular emphasis on machine learning for pattern analysis. Areas such as techniques for visual search, document and handwriting analysis, medical image analysis, video and image sequence analysis, content-based retrieval of image and video, face and gesture recognition and relevant specialized hardware and/or software architectures are also covered.