{"title":"I❤MESH: A DSL for Mesh Processing","authors":"Yong Li, Shoaib Kamil, Keenan Crane, Alec Jacobson, Yotam Gingold","doi":"10.1145/3662181","DOIUrl":"https://doi.org/10.1145/3662181","url":null,"abstract":"<p>Mesh processing algorithms are often communicated via concise mathematical notation (e.g., summation over mesh neighborhoods). However, conversion of notation into working code remains a time consuming and error-prone process which requires arcane knowledge of low-level data structures and libraries—impeding rapid exploration of high-level algorithms. We address this problem by introducing a domain-specific language (DSL) for mesh processing called I❤MESH, which resembles notation commonly used in visual and geometric computing, and automates the process of converting notation into code. The centerpiece of our language is a flexible notation for specifying and manipulating neighborhoods of a cell complex, internally represented via standard operations on sparse boundary matrices. This layered design enables natural expression of algorithms while minimizing demands on a code generation back-end. In particular, by integrating I❤MESH with the linear algebra features of the I❤LA DSL, and adding support for automatic differentiation, we can rapidly implement a rich variety of algorithms on point clouds, surface meshes, and volume meshes.</p>","PeriodicalId":50913,"journal":{"name":"ACM Transactions on Graphics","volume":"4 1","pages":""},"PeriodicalIF":6.2,"publicationDate":"2024-05-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140817849","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Evaluating gesture generation in a large-scale open challenge: The GENEA Challenge 2022","authors":"Taras Kucherenko, Pieter Wolfert, Youngwoo Yoon, Carla Viegas, Teodor Nikolov, Mihail Tsakov, Gustav Eje Henter","doi":"10.1145/3656374","DOIUrl":"https://doi.org/10.1145/3656374","url":null,"abstract":"<p>This paper reports on the second GENEA Challenge to benchmark data-driven automatic co-speech gesture generation. Participating teams used the same speech and motion dataset to build gesture-generation systems. Motion generated by all these systems was rendered to video using a standardised visualisation pipeline and evaluated in several large, crowdsourced user studies. Unlike when comparing different research papers, differences in results are here only due to differences between methods, enabling direct comparison between systems. The dataset was based on 18 hours of full-body motion capture, including fingers, of different persons engaging in a dyadic conversation. Ten teams participated in the challenge across two tiers: full-body and upper-body gesticulation. For each tier, we evaluated both the human-likeness of the gesture motion and its appropriateness for the specific speech signal. Our evaluations decouple human-likeness from gesture appropriateness, which has been a difficult problem in the field. </p><p>The evaluation results show some synthetic gesture conditions being rated as significantly more human-like than 3D human motion capture. To the best of our knowledge, this has not been demonstrated before. On the other hand, all synthetic motion is found to be vastly less appropriate for the speech than the original motion-capture recordings. We also find that conventional objective metrics do not correlate well with subjective human-likeness ratings in this large evaluation. The one exception is the Fréchet gesture distance (FGD), which achieves a Kendall’s tau rank correlation of around (-0.5). Based on the challenge results we formulate numerous recommendations for system building and evaluation.</p>","PeriodicalId":50913,"journal":{"name":"ACM Transactions on Graphics","volume":"9 1","pages":""},"PeriodicalIF":6.2,"publicationDate":"2024-04-27","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140651568","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Differentiable solver for time-dependent deformation problems with contact","authors":"Zizhou Huang, Davi Colli Tozoni, Arvi Gjoka, Zachary Ferguson, Teseo Schneider, Daniele Panozzo, Denis Zorin","doi":"10.1145/3657648","DOIUrl":"https://doi.org/10.1145/3657648","url":null,"abstract":"<p>We introduce a general differentiable solver for time-dependent deformation problems with contact and friction. Our approach uses a finite element discretization with a high-order time integrator coupled with the recently proposed incremental potential contact method for handling contact and friction forces to solve ODE- and PDE-constrained optimization problems on scenes with complex geometry. It supports static and dynamic problems and differentiation with respect to all physical parameters involved in the physical problem description, which include shape, material parameters, friction parameters, and initial conditions. Our analytically derived adjoint formulation is efficient, with a small overhead (typically less than 10% for nonlinear problems) over the forward simulation, and shares many similarities with the forward problem, allowing the reuse of large parts of existing forward simulator code. </p><p>We implement our approach on top of the open-source PolyFEM library and demonstrate the applicability of our solver to shape design, initial condition optimization, and material estimation on both simulated results and physical validations.</p>","PeriodicalId":50913,"journal":{"name":"ACM Transactions on Graphics","volume":"8 1","pages":""},"PeriodicalIF":6.2,"publicationDate":"2024-04-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140651314","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Real-Time Neural Appearance Models","authors":"Tizian Zeltner, Fabrice Rousselle, Andrea Weidlich, Petrik Clarberg, Jan Novák, Benedikt Bitterli, Alex Evans, Tomáš Davidovič, Simon Kallweit, Aaron Lefohn","doi":"10.1145/3659577","DOIUrl":"https://doi.org/10.1145/3659577","url":null,"abstract":"<p>We present a complete system for real-time rendering of scenes with complex appearance previously reserved for offline use. This is achieved with a combination of algorithmic and system level innovations. </p><p>Our appearance model utilizes learned hierarchical textures that are interpreted using neural decoders, which produce reflectance values and importance-sampled directions. To best utilize the modeling capacity of the decoders, we equip the decoders with two graphics priors. The first prior—transformation of directions into learned shading frames—facilitates accurate reconstruction of mesoscale effects. The second prior—a microfacet sampling distribution—allows the neural decoder to perform importance sampling efficiently. The resulting appearance model supports anisotropic sampling and level-of-detail rendering, and allows baking deeply layered material graphs into a compact unified neural representation. </p><p>By exposing hardware accelerated tensor operations to ray tracing shaders, we show that it is possible to inline and execute the neural decoders efficiently inside a real-time path tracer. We analyze scalability with increasing number of neural materials and propose to improve performance using code optimized for coherent and divergent execution. Our neural material shaders can be over an order of magnitude faster than non-neural layered materials. This opens up the door for using film-quality visuals in real-time applications such as games and live previews.</p>","PeriodicalId":50913,"journal":{"name":"ACM Transactions on Graphics","volume":"16 1","pages":""},"PeriodicalIF":6.2,"publicationDate":"2024-04-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140621586","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"ConceptLab: Creative Concept Generation using VLM-Guided Diffusion Prior Constraints","authors":"Elad Richardson, Kfir Goldberg, Yuval Alaluf, Daniel Cohen-Or","doi":"10.1145/3659578","DOIUrl":"https://doi.org/10.1145/3659578","url":null,"abstract":"<p>Recent text-to-image generative models have enabled us to transform our words into vibrant, captivating imagery. The surge of personalization techniques that has followed has also allowed us to imagine unique concepts in new scenes. However, an intriguing question remains: How can we generate a <i>new</i>, imaginary concept that has never been seen before? In this paper, we present the task of <i>creative text-to-image generation</i>, where we seek to generate new members of a broad category (e.g., generating a pet that differs from all existing pets). We leverage the under-studied Diffusion Prior models and show that the creative generation problem can be formulated as an optimization process over the output space of the diffusion prior, resulting in a set of “prior constraints”. To keep our generated concept from converging into existing members, we incorporate a question-answering Vision-Language Model (VLM) that adaptively adds new constraints to the optimization problem, encouraging the model to discover increasingly more unique creations. Finally, we show that our prior constraints can also serve as a strong mixing mechanism allowing us to create hybrids between generated concepts, introducing even more flexibility into the creative process.</p>","PeriodicalId":50913,"journal":{"name":"ACM Transactions on Graphics","volume":"25 1","pages":""},"PeriodicalIF":6.2,"publicationDate":"2024-04-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140557143","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"DMHomo: Learning Homography with Diffusion Models","authors":"Haipeng Li, Hai Jiang, Ao Luo, Ping Tan, Haoqiang Fan, Bing Zeng, Shuaicheng Liu","doi":"10.1145/3652207","DOIUrl":"https://doi.org/10.1145/3652207","url":null,"abstract":"<p>Supervised homography estimation methods face a challenge due to the lack of adequate labeled training data. To address this issue, we propose DMHomo, a diffusion model-based framework for supervised homography learning. This framework generates image pairs with accurate labels, realistic image content, and realistic interval motion, ensuring they satisfy adequate pairs. We utilize unlabeled image pairs with pseudo-labels such as homography and dominant plane masks, computed from existing methods, to train a diffusion model that generates a supervised training dataset. To further enhance performance, we introduce a new probabilistic mask loss, which identifies outlier regions through supervised training, and an iterative mechanism to optimize the generative and homography models successively. Our experimental results demonstrate that DMHomo effectively overcomes the scarcity of qualified datasets in supervised homography learning and improves generalization to real-world scenes. The code and dataset are available at: https://github.com/lhaippp/DMHomo</p>","PeriodicalId":50913,"journal":{"name":"ACM Transactions on Graphics","volume":"107 1","pages":""},"PeriodicalIF":6.2,"publicationDate":"2024-03-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140096891","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Joint Stroke Tracing and Correspondence for 2D Animation","authors":"Haoran Mo, Chengying Gao, Ruomei Wang","doi":"10.1145/3649890","DOIUrl":"https://doi.org/10.1145/3649890","url":null,"abstract":"<p>To alleviate human labor in redrawing keyframes with ordered vector strokes for automatic inbetweening, we for the first time propose a joint stroke tracing and correspondence approach. Given consecutive raster keyframes along with a single vector image of the starting frame as a guidance, the approach generates vector drawings for the remaining keyframes while ensuring one-to-one stroke correspondence. Our framework trained on clean line drawings generalizes to rough sketches and the generated results can be imported into inbetweening systems to produce inbetween sequences. Hence, the method is compatible with standard 2D animation workflow. An adaptive spatial transformation module (ASTM) is introduced to handle non-rigid motions and stroke distortion. We collect a dataset for training, with 10k+ pairs of raster frames and their vector drawings with stroke correspondence. Comprehensive validations on real clean and rough animated frames manifest the effectiveness of our method and superiority to existing methods.</p>","PeriodicalId":50913,"journal":{"name":"ACM Transactions on Graphics","volume":"52 1","pages":""},"PeriodicalIF":6.2,"publicationDate":"2024-02-29","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"139994136","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A Dual-Particle Approach for Incompressible SPH Fluids","authors":"Shusen Liu, Xiaowei He, Yuzhong Guo, Yue Chang, Wencheng Wang","doi":"10.1145/3649888","DOIUrl":"https://doi.org/10.1145/3649888","url":null,"abstract":"<p>Tensile instability is one of the major obstacles to particle methods in fluid simulation, which would cause particles to clump in pairs under tension and prevent fluid simulation to generate small-scale thin features. To address this issue, previous particle methods either use a background pressure or a finite difference scheme to alleviate the particle clustering artifacts, yet still fail to produce small-scale thin features in free-surface flows. In this paper, we propose a dual-particle approach for simulating incompressible fluids. Our approach involves incorporating supplementary virtual particles designed to capture and store particle pressures. These pressure samples undergo systematic redistribution at each time step, grounded in the initial positions of the fluid particles. By doing so, we effectively reduce tensile instability in standard SPH by narrowing down the unstable regions for particles experiencing tensile stress. As a result, we can accurately simulate free-surface flows with rich small-scale thin features, such as droplets, streamlines, and sheets, as demonstrated by experimental results.</p>","PeriodicalId":50913,"journal":{"name":"ACM Transactions on Graphics","volume":"48 1","pages":""},"PeriodicalIF":6.2,"publicationDate":"2024-02-29","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"139994048","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"HQ3DAvatar: High Quality Implicit 3D Head Avatar","authors":"Kartik Teotia, Mallikarjun B R, Xingang Pan, Hyeongwoo Kim, Pablo Garrido, Mohamed Elgharib, Christian Theobalt","doi":"10.1145/3649889","DOIUrl":"https://doi.org/10.1145/3649889","url":null,"abstract":"<p>Multi-view volumetric rendering techniques have recently shown great potential in modeling and synthesizing high-quality head avatars. A common approach to capture full head dynamic performances is to track the underlying geometry using a mesh-based template or 3D cube-based graphics primitives. While these model-based approaches achieve promising results, they often fail to learn complex geometric details such as the mouth interior, hair, and topological changes over time. This paper presents a novel approach to building highly photorealistic digital head avatars. Our method learns a canonical space via an implicit function parameterized by a neural network. It leverages multiresolution hash encoding in the learned feature space, allowing for high-quality, faster training and high-resolution rendering. At test time, our method is driven by a monocular RGB video. Here, an image encoder extracts face-specific features that also condition the learnable canonical space. This encourages deformation-dependent texture variations during training. We also propose a novel optical flow based loss that ensures correspondences in the learned canonical space, thus encouraging artifact-free and temporally consistent renderings. We show results on challenging facial expressions and show free-viewpoint renderings at interactive real-time rates for a resolution of 480<i>x</i>270. Our method outperforms related approaches both visually and numerically. We will release our multiple-identity dataset to encourage further research.</p>","PeriodicalId":50913,"journal":{"name":"ACM Transactions on Graphics","volume":"15 1","pages":""},"PeriodicalIF":6.2,"publicationDate":"2024-02-29","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140001070","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Online Neural Path Guiding with Normalized Anisotropic Spherical Gaussians","authors":"Jiawei Huang, Akito Iizuka, Hajime Tanaka, Taku Komura, Yoshifumi Kitamura","doi":"10.1145/3649310","DOIUrl":"https://doi.org/10.1145/3649310","url":null,"abstract":"<p>Importance sampling techniques significantly reduce variance in physically-based rendering. In this paper we propose a novel online framework to learn the spatial-varying distribution of the full product of the rendering equation, with a single small neural network using stochastic ray samples. The learned distributions can be used to efficiently sample the full product of incident light. To accomplish this, we introduce a novel closed-form density model, called the Normalized Anisotropic Spherical Gaussian mixture, that can model a complex light field with a small number of parameters and that can be directly sampled. Our framework progressively renders and learns the distribution, without requiring any warm-up phases. With the compact and expressive representation of our density model, our framework can be implemented entirely on the GPU, allowing it to produce high-quality images with limited computational resources. The results show that our framework outperforms existing neural path guiding approaches and achieves comparable or even better performance than state-of-the-art online statistical path guiding techniques.</p>","PeriodicalId":50913,"journal":{"name":"ACM Transactions on Graphics","volume":"27 1","pages":""},"PeriodicalIF":6.2,"publicationDate":"2024-02-28","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"139994036","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}