{"title":"Sketch-Guided Text-to-Image Diffusion Models","authors":"A. Voynov, Kfir Aberman, D. Cohen-Or","doi":"10.1145/3588432.3591560","DOIUrl":"https://doi.org/10.1145/3588432.3591560","url":null,"abstract":"Text-to-Image models have introduced a remarkable leap in the evolution of machine learning, demonstrating high-quality synthesis of images from a given text-prompt. However, these powerful pretrained models still lack control handles that can guide spatial properties of the synthesized images. In this work, we introduce a universal approach to guide a pretrained text-to-image diffusion model, with a spatial map from another domain (e.g., sketch) during inference time. Unlike previous works, our method does not require to train a dedicated model or a specialized encoder for the task. Our key idea is to train a Latent Guidance Predictor (LGP) - a small, per-pixel, Multi-Layer Perceptron (MLP) that maps latent features of noisy images to spatial maps, where the deep features are extracted from the core Denoising Diffusion Probabilistic Model (DDPM) network. The LGP is trained only on a few thousand images and constitutes a differential guiding map predictor, over which the loss is computed and propagated back to push the intermediate images to agree with the spatial map. The per-pixel training offers flexibility and locality which allows the technique to perform well on out-of-domain sketches, including free-hand style drawings. We take a particular focus on the sketch-to-image translation task, revealing a robust and expressive way to generate images that follow the guidance of a sketch of arbitrary style or domain.","PeriodicalId":280036,"journal":{"name":"ACM SIGGRAPH 2023 Conference Proceedings","volume":"15 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-11-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130523754","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"AvatarMAV: Fast 3D Head Avatar Reconstruction Using Motion-Aware Neural Voxels","authors":"Yuelang Xu, Lizhen Wang, Xiaochen Zhao, Hongwen Zhang, Yebin Liu","doi":"10.1145/3588432.3591567","DOIUrl":"https://doi.org/10.1145/3588432.3591567","url":null,"abstract":"With NeRF widely used for facial reenactment, recent methods can recover photo-realistic 3D head avatar from just a monocular video. Unfortunately, the training process of the NeRF-based methods is quite time-consuming, as MLP used in the NeRF-based methods is inefficient and requires too many iterations to converge. To overcome this problem, we propose AvatarMAV, a fast 3D head avatar reconstruction method using Motion-Aware Neural Voxels. AvatarMAV is the first to model both the canonical appearance and the decoupled expression motion by neural voxels for head avatar. In particular, the motion-aware neural voxels is generated from the weighted concatenation of multiple 4D tensors. The 4D tensors semantically correspond one-to-one with 3DMM expression basis and share the same weights as 3DMM expression coefficients. Benefiting from our novel representation, the proposed AvatarMAV can recover photo-realistic head avatars in just 5 minutes (implemented with pure PyTorch), which is significantly faster than the state-of-the-art facial reenactment methods. Project page: https://www.liuyebin.com/avatarmav.","PeriodicalId":280036,"journal":{"name":"ACM SIGGRAPH 2023 Conference Proceedings","volume":"11 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-11-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"126549491","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"CLIP-PAE: Projection-Augmentation Embedding to Extract Relevant Features for a Disentangled, Interpretable and Controllable Text-Guided Face Manipulation","authors":"Chenliang Zhou, Fangcheng Zhong, C. Öztireli","doi":"10.1145/3588432.3591532","DOIUrl":"https://doi.org/10.1145/3588432.3591532","url":null,"abstract":"Recently introduced Contrastive Language-Image Pre-Training (CLIP) [Radford et al. 2021] bridges images and text by embedding them into a joint latent space. This opens the door to ample literature that aims to manipulate an input image by providing a textual explanation. However, due to the discrepancy between image and text embeddings in the joint space, using text embeddings as the optimization target often introduces undesired artifacts in the resulting images. Disentanglement, interpretability, and controllability are also hard to guarantee for manipulation. To alleviate these problems, we propose to define corpus subspaces spanned by relevant prompts to capture specific image characteristics. We introduce CLIP projection-augmentation embedding (PAE) as an optimization target to improve the performance of text-guided image manipulation. Our method is a simple and general paradigm that can be easily computed and adapted, and smoothly incorporated into any CLIP-based image manipulation algorithm. To demonstrate the effectiveness of our method, we conduct several theoretical and empirical studies. As a case study, we utilize the method for text-guided semantic face editing. We quantitatively and qualitatively demonstrate that PAE facilitates a more disentangled, interpretable, and controllable face image manipulation with state-of-the-art quality and accuracy.","PeriodicalId":280036,"journal":{"name":"ACM SIGGRAPH 2023 Conference Proceedings","volume":"63 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-10-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115819128","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"COFS: COntrollable Furniture layout Synthesis","authors":"W. Para, Paul Guerrero, N. Mitra, Peter Wonka","doi":"10.1145/3588432.3591561","DOIUrl":"https://doi.org/10.1145/3588432.3591561","url":null,"abstract":"Realistic, scalable, and controllable generation of furniture layouts is essential for many applications in virtual reality, augmented reality, game development and synthetic data generation. The most successful current methods tackle this problem as a sequence generation problem which imposes a specific ordering on the elements of the layout, making it hard to exert fine-grained control over the attributes of a generated scene. Existing methods provide control through object-level conditioning, or scene completion, where generation can be conditioned on an arbitrary subset of furniture objects. However, attribute-level conditioning, where generation can be conditioned on an arbitrary subset of object attributes, is not supported. We propose COFS, a method to generate furniture layouts that enables fine-grained control through attribute-level conditioning. For example, COFS allows specifying only the scale and type of objects that should be placed in the scene and the generator chooses their positions and orientations; or the position that should be occupied by objects can be specified and the generator chooses their type, scale, orientation, etc. Our results show both qualitatively and quantitatively that we significantly outperform existing methods on attribute-level conditioning.","PeriodicalId":280036,"journal":{"name":"ACM SIGGRAPH 2023 Conference Proceedings","volume":"173 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-05-29","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"124236956","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"High-Order Incremental Potential Contact for Elastodynamic Simulation on Curved Meshes","authors":"Z. Ferguson, Pranav Jain, D. Zorin, T. Schneider, Daniele Panozzo","doi":"10.1145/3588432.3591488","DOIUrl":"https://doi.org/10.1145/3588432.3591488","url":null,"abstract":"High-order bases provide major advantages over linear ones in terms of efficiency, as they provide (for the same physical model) higher accuracy for the same running time, and reliability, as they are less affected by locking artifacts and mesh quality. Thus, we introduce a high-order finite element (FE) formulation (high-order bases) for elastodynamic simulation on high-order (curved) meshes with contact handling based on the recently proposed Incremental Potential Contact (IPC) model. Our approach is based on the observation that each IPC optimization step used to minimize the elasticity, contact, and friction potentials leads to linear trajectories even in the presence of nonlinear meshes or nonlinear FE bases. It is thus possible to retain the strong non-penetration guarantees and large time steps of the original formulation while benefiting from the high-order bases and high-order geometry. We accomplish this by mapping displacements and resulting contact forces between a linear collision proxy and the underlying high-order representation. We demonstrate the effectiveness of our approach in a selection of problems from graphics, computational fabrication, and scientific computing.","PeriodicalId":280036,"journal":{"name":"ACM SIGGRAPH 2023 Conference Proceedings","volume":"601 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-05-27","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"132789223","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}