ACM SIGGRAPH 2023 Conference Proceedings: Latest Articles

Sketch-Guided Text-to-Image Diffusion Models
ACM SIGGRAPH 2023 Conference Proceedings Pub Date: 2022-11-24 DOI: 10.1145/3588432.3591560
A. Voynov, Kfir Aberman, D. Cohen-Or
{"title":"Sketch-Guided Text-to-Image Diffusion Models","authors":"A. Voynov, Kfir Aberman, D. Cohen-Or","doi":"10.1145/3588432.3591560","DOIUrl":"https://doi.org/10.1145/3588432.3591560","url":null,"abstract":"Text-to-Image models have introduced a remarkable leap in the evolution of machine learning, demonstrating high-quality synthesis of images from a given text-prompt. However, these powerful pretrained models still lack control handles that can guide spatial properties of the synthesized images. In this work, we introduce a universal approach to guide a pretrained text-to-image diffusion model, with a spatial map from another domain (e.g., sketch) during inference time. Unlike previous works, our method does not require to train a dedicated model or a specialized encoder for the task. Our key idea is to train a Latent Guidance Predictor (LGP) - a small, per-pixel, Multi-Layer Perceptron (MLP) that maps latent features of noisy images to spatial maps, where the deep features are extracted from the core Denoising Diffusion Probabilistic Model (DDPM) network. The LGP is trained only on a few thousand images and constitutes a differential guiding map predictor, over which the loss is computed and propagated back to push the intermediate images to agree with the spatial map. The per-pixel training offers flexibility and locality which allows the technique to perform well on out-of-domain sketches, including free-hand style drawings. We take a particular focus on the sketch-to-image translation task, revealing a robust and expressive way to generate images that follow the guidance of a sketch of arbitrary style or domain.","PeriodicalId":280036,"journal":{"name":"ACM SIGGRAPH 2023 Conference Proceedings","volume":"15 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-11-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130523754","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 67
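The abstract's core mechanism, a small per-pixel MLP whose loss gradient steers the noisy latent toward a sketch, can be illustrated with a short PyTorch sketch. The class and function names, feature dimensions, and the plain gradient step below are illustrative assumptions, not the authors' released code.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class LatentGuidancePredictor(nn.Module):
    """Minimal per-pixel MLP in the spirit of the LGP described above.

    Maps a feature vector extracted at every pixel of a noisy latent image
    to a single sketch/edge value. The feature dimension and layer sizes
    are illustrative guesses, not the paper's settings.
    """
    def __init__(self, feat_dim=512, hidden=128):
        super().__init__()
        self.mlp = nn.Sequential(
            nn.Linear(feat_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, 1),
        )

    def forward(self, feats):
        # feats: (B, H, W, feat_dim) per-pixel features gathered from the
        # denoiser's intermediate activations; returns a (B, H, W) map.
        return self.mlp(feats).squeeze(-1)

def guidance_step(latent, feature_extractor, target_sketch, lgp, weight=1.0):
    """One inference-time guidance step: nudge the noisy latent so that the
    LGP's predicted spatial map agrees with the target sketch."""
    latent = latent.detach().requires_grad_(True)
    feats = feature_extractor(latent)          # (B, H, W, feat_dim) deep features
    pred = lgp(feats)                          # predicted spatial map
    loss = F.mse_loss(pred, target_sketch)     # agreement with the guiding sketch
    grad = torch.autograd.grad(loss, latent)[0]
    return (latent - weight * grad).detach()   # move the latent toward the sketch
```

In a full sampler this step would typically be interleaved with the DDPM denoising iterations, with the guidance weight annealed over the course of sampling.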
AvatarMAV: Fast 3D Head Avatar Reconstruction Using Motion-Aware Neural Voxels
ACM SIGGRAPH 2023 Conference Proceedings Pub Date: 2022-11-23 DOI: 10.1145/3588432.3591567
Yuelang Xu, Lizhen Wang, Xiaochen Zhao, Hongwen Zhang, Yebin Liu
{"title":"AvatarMAV: Fast 3D Head Avatar Reconstruction Using Motion-Aware Neural Voxels","authors":"Yuelang Xu, Lizhen Wang, Xiaochen Zhao, Hongwen Zhang, Yebin Liu","doi":"10.1145/3588432.3591567","DOIUrl":"https://doi.org/10.1145/3588432.3591567","url":null,"abstract":"With NeRF widely used for facial reenactment, recent methods can recover photo-realistic 3D head avatar from just a monocular video. Unfortunately, the training process of the NeRF-based methods is quite time-consuming, as MLP used in the NeRF-based methods is inefficient and requires too many iterations to converge. To overcome this problem, we propose AvatarMAV, a fast 3D head avatar reconstruction method using Motion-Aware Neural Voxels. AvatarMAV is the first to model both the canonical appearance and the decoupled expression motion by neural voxels for head avatar. In particular, the motion-aware neural voxels is generated from the weighted concatenation of multiple 4D tensors. The 4D tensors semantically correspond one-to-one with 3DMM expression basis and share the same weights as 3DMM expression coefficients. Benefiting from our novel representation, the proposed AvatarMAV can recover photo-realistic head avatars in just 5 minutes (implemented with pure PyTorch), which is significantly faster than the state-of-the-art facial reenactment methods. Project page: https://www.liuyebin.com/avatarmav.","PeriodicalId":280036,"journal":{"name":"ACM SIGGRAPH 2023 Conference Proceedings","volume":"11 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-11-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"126549491","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 16
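A minimal sketch of the voxel blending described above, assuming a literal reading of "weighted concatenation": each expression-basis voxel grid is scaled by its 3DMM expression coefficient and the results are stacked along the channel axis. The shapes, names, and exact blending rule are illustrative guesses, not the paper's implementation.

```python
import torch

# Illustrative sizes only: K expression-basis voxel grids, each a 4D tensor
# of shape (channels, X, Y, Z).
K, C, R = 10, 4, 32
motion_basis = torch.randn(K, C, R, R, R, requires_grad=True)  # learnable 4D tensors

def motion_voxels(expr_coeffs):
    """Scale each basis grid by its 3DMM expression coefficient and
    concatenate the results along the channel axis."""
    weighted = expr_coeffs.view(K, 1, 1, 1, 1) * motion_basis
    return weighted.reshape(K * C, R, R, R)

expr = torch.rand(K)              # e.g. tracked per-frame 3DMM expression coefficients
voxels = motion_voxels(expr)      # (K*C, R, R, R) motion-aware feature volume
```

The resulting feature volume would then be queried (e.g. by trilinear interpolation) alongside the canonical-appearance voxels during volume rendering.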
CLIP-PAE: Projection-Augmentation Embedding to Extract Relevant Features for a Disentangled, Interpretable and Controllable Text-Guided Face Manipulation
ACM SIGGRAPH 2023 Conference Proceedings Pub Date: 2022-10-08 DOI: 10.1145/3588432.3591532
Chenliang Zhou, Fangcheng Zhong, C. Öztireli
{"title":"CLIP-PAE: Projection-Augmentation Embedding to Extract Relevant Features for a Disentangled, Interpretable and Controllable Text-Guided Face Manipulation","authors":"Chenliang Zhou, Fangcheng Zhong, C. Öztireli","doi":"10.1145/3588432.3591532","DOIUrl":"https://doi.org/10.1145/3588432.3591532","url":null,"abstract":"Recently introduced Contrastive Language-Image Pre-Training (CLIP) [Radford et al. 2021] bridges images and text by embedding them into a joint latent space. This opens the door to ample literature that aims to manipulate an input image by providing a textual explanation. However, due to the discrepancy between image and text embeddings in the joint space, using text embeddings as the optimization target often introduces undesired artifacts in the resulting images. Disentanglement, interpretability, and controllability are also hard to guarantee for manipulation. To alleviate these problems, we propose to define corpus subspaces spanned by relevant prompts to capture specific image characteristics. We introduce CLIP projection-augmentation embedding (PAE) as an optimization target to improve the performance of text-guided image manipulation. Our method is a simple and general paradigm that can be easily computed and adapted, and smoothly incorporated into any CLIP-based image manipulation algorithm. To demonstrate the effectiveness of our method, we conduct several theoretical and empirical studies. As a case study, we utilize the method for text-guided semantic face editing. We quantitatively and qualitatively demonstrate that PAE facilitates a more disentangled, interpretable, and controllable face image manipulation with state-of-the-art quality and accuracy.","PeriodicalId":280036,"journal":{"name":"ACM SIGGRAPH 2023 Conference Proceedings","volume":"63 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-10-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115819128","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 1
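The "corpus subspace" idea can be sketched compactly: given CLIP embeddings of a set of relevant prompts, build an orthonormal basis for their span and project the target prompt embedding onto it. The function below is a hedged sketch; how the projected component is mixed back in (the `alpha` rule) is an assumption, and the CLIP text encoder that produces the embeddings is omitted.

```python
import torch

def projection_augmentation(target_emb, corpus_embs, alpha=1.0):
    """Project a target text embedding onto the subspace spanned by a corpus
    of relevant prompt embeddings, then augment the original embedding with
    the projected component.

    target_emb:  (d,)   CLIP embedding of the editing prompt
    corpus_embs: (n, d) CLIP embeddings of the relevant prompts
    alpha:       strength of the augmentation (illustrative rule)
    """
    # Orthonormal basis of the corpus subspace via reduced QR decomposition.
    q, _ = torch.linalg.qr(corpus_embs.T)          # q: (d, n)
    projected = q @ (q.T @ target_emb)             # component inside the subspace
    pae = target_emb + alpha * (projected - target_emb)
    return pae / pae.norm()                        # keep on the unit sphere, like CLIP
```

The returned `pae` vector can then replace the raw text embedding as the optimization target in a CLIP-loss-based editing loop.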
COFS: COntrollable Furniture layout Synthesis
ACM SIGGRAPH 2023 Conference Proceedings Pub Date: 2022-05-29 DOI: 10.1145/3588432.3591561
W. Para, Paul Guerrero, N. Mitra, Peter Wonka
{"title":"COFS: COntrollable Furniture layout Synthesis","authors":"W. Para, Paul Guerrero, N. Mitra, Peter Wonka","doi":"10.1145/3588432.3591561","DOIUrl":"https://doi.org/10.1145/3588432.3591561","url":null,"abstract":"Realistic, scalable, and controllable generation of furniture layouts is essential for many applications in virtual reality, augmented reality, game development and synthetic data generation. The most successful current methods tackle this problem as a sequence generation problem which imposes a specific ordering on the elements of the layout, making it hard to exert fine-grained control over the attributes of a generated scene. Existing methods provide control through object-level conditioning, or scene completion, where generation can be conditioned on an arbitrary subset of furniture objects. However, attribute-level conditioning, where generation can be conditioned on an arbitrary subset of object attributes, is not supported. We propose COFS, a method to generate furniture layouts that enables fine-grained control through attribute-level conditioning. For example, COFS allows specifying only the scale and type of objects that should be placed in the scene and the generator chooses their positions and orientations; or the position that should be occupied by objects can be specified and the generator chooses their type, scale, orientation, etc. Our results show both qualitatively and quantitatively that we significantly outperform existing methods on attribute-level conditioning.","PeriodicalId":280036,"journal":{"name":"ACM SIGGRAPH 2023 Conference Proceedings","volume":"173 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-05-29","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"124236956","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 2
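Attribute-level conditioning, as opposed to object-level conditioning, is easiest to see on a concrete partial specification: some attributes of some objects are fixed and everything else is left for the generator. The data layout and the `predict_attribute` call below are purely hypothetical placeholders used to make that distinction concrete; they are not the COFS interface.

```python
# Hypothetical partial specification: any subset of per-object attributes may
# be fixed (here type and scale for the sofa, position for the table), while
# the generator fills in whatever is left as None.
partial_scene = [
    {"type": "sofa",  "scale": (2.0, 0.9, 0.8), "position": None,            "orientation": None},
    {"type": "table", "scale": None,            "position": (1.5, 0.0, 0.4), "orientation": None},
]

def complete_layout(model, partial_scene):
    """Fill every unspecified attribute in turn; `model.predict_attribute`
    is a placeholder for whatever generator performs the conditioning."""
    scene = [dict(obj) for obj in partial_scene]
    for obj in scene:
        for attr, value in obj.items():
            if value is None:
                obj[attr] = model.predict_attribute(scene, obj, attr)
    return scene
```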
High-Order Incremental Potential Contact for Elastodynamic Simulation on Curved Meshes
ACM SIGGRAPH 2023 Conference Proceedings Pub Date: 2022-05-27 DOI: 10.1145/3588432.3591488
Z. Ferguson, Pranav Jain, D. Zorin, T. Schneider, Daniele Panozzo
{"title":"High-Order Incremental Potential Contact for Elastodynamic Simulation on Curved Meshes","authors":"Z. Ferguson, Pranav Jain, D. Zorin, T. Schneider, Daniele Panozzo","doi":"10.1145/3588432.3591488","DOIUrl":"https://doi.org/10.1145/3588432.3591488","url":null,"abstract":"High-order bases provide major advantages over linear ones in terms of efficiency, as they provide (for the same physical model) higher accuracy for the same running time, and reliability, as they are less affected by locking artifacts and mesh quality. Thus, we introduce a high-order finite element (FE) formulation (high-order bases) for elastodynamic simulation on high-order (curved) meshes with contact handling based on the recently proposed Incremental Potential Contact (IPC) model. Our approach is based on the observation that each IPC optimization step used to minimize the elasticity, contact, and friction potentials leads to linear trajectories even in the presence of nonlinear meshes or nonlinear FE bases. It is thus possible to retain the strong non-penetration guarantees and large time steps of the original formulation while benefiting from the high-order bases and high-order geometry. We accomplish this by mapping displacements and resulting contact forces between a linear collision proxy and the underlying high-order representation. We demonstrate the effectiveness of our approach in a selection of problems from graphics, computational fabrication, and scientific computing.","PeriodicalId":280036,"journal":{"name":"ACM SIGGRAPH 2023 Conference Proceedings","volume":"601 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-05-27","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"132789223","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 2
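The displacement/force mapping mentioned at the end of the abstract can be illustrated with a standard finite element transfer: if a matrix B evaluates the high-order basis at the vertices of the linear collision proxy, displacements are pushed to the proxy via B and contact forces are pulled back to the high-order degrees of freedom via B transposed (so that virtual work is preserved). The matrix construction below is a random placeholder; only the B / B-transpose pattern is the point.

```python
import numpy as np

def build_proxy_map(n_proxy, n_ho, seed=0):
    """Placeholder for the matrix B whose rows hold the high-order basis
    functions evaluated at each proxy vertex (random here; sparse and
    geometry-derived in a real simulator)."""
    rng = np.random.default_rng(seed)
    B = rng.random((n_proxy, n_ho))
    return B / B.sum(axis=1, keepdims=True)   # rows as partition-of-unity weights

B = build_proxy_map(n_proxy=6, n_ho=4)

u_high_order = np.zeros((4, 3))               # high-order nodal displacements (x, y, z)
u_proxy = B @ u_high_order                    # displacements pushed to the collision proxy

f_proxy = np.ones((6, 3))                     # contact forces computed on the proxy
f_high_order = B.T @ f_proxy                  # equivalent forces on the high-order DOFs
```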
ACM SIGGRAPH 2023 Conference Proceedings
ACM SIGGRAPH 2023 Conference Proceedings Pub Date: 1900-01-01 DOI: 10.1145/3588432
{"title":"ACM SIGGRAPH 2023 Conference Proceedings","authors":"","doi":"10.1145/3588432","DOIUrl":"https://doi.org/10.1145/3588432","url":null,"abstract":"","PeriodicalId":280036,"journal":{"name":"ACM SIGGRAPH 2023 Conference Proceedings","volume":"2015 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1900-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128064802","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0