{"title":"Differentiable Heightfield Path Tracing with Accelerated Discontinuities","authors":"Xiao-ran Tong, Hsueh-Ti Derek Liu, Y. Gingold, Alec Jacobson","doi":"10.1145/3588432.3591530","DOIUrl":"https://doi.org/10.1145/3588432.3591530","url":null,"abstract":"We investigate the problem of accelerating a physically-based differentiable renderer for heightfields based on path tracing with global illumination. On a heightfield with 1 million vertices (1024 × 1024 resolution), our differentiable renderer requires only 4 ms per sample per pixel when differentiating direct illumination, orders of magnitude faster than most existing general 3D mesh differentiable renderers. It is well-known that one can leverage spatial hierarchical data structures (e.g., the maximum mipmaps) to accelerate the forward pass of heightfield rendering. The key idea of our approach is to further utilize the hierarchy to speed up the backward pass—differentiable heightfield rendering. Specifically, we use the maximum mipmaps to accelerate the process of identifying scene discontinuities, which is crucial for obtaining accurate derivatives. Our renderer supports global illumination. we are able to optimize global effects, such as shadows, with respect to the geometry and the material parameters. Our differentiable renderer achieves real-time frame rates and unlocks interactive inverse rendering applications. We demonstrate the flexibility of our method with terrain optimization, geometric illusions, shadow optimization, and text-based shape generation.","PeriodicalId":280036,"journal":{"name":"ACM SIGGRAPH 2023 Conference Proceedings","volume":"35 11","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-07-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"132286613","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Somigliana Coordinates: an elasticity-derived approach for cage deformation","authors":"Jiong Chen, Fernando de Goes, M. Desbrun","doi":"10.1145/3588432.3591519","DOIUrl":"https://doi.org/10.1145/3588432.3591519","url":null,"abstract":"In this paper, we present a novel cage deformer based on elasticity-derived matrix-valued coordinates. In order to bypass the typical shearing artifacts and lack of volume control of existing cage deformers, we promote a more elastic behavior of the cage deformation by deriving our coordinates from the Somigliana identity, a boundary integral formulation based on the fundamental solution of linear elasticity. Given an initial cage and its deformed pose, the deformation of the cage interior is deduced from these Somigliana coordinates via a corotational scheme, resulting in a matrix-weighted combination of both vertex positions and face normals of the cage. Our deformer thus generalizes Green coordinates, while producing physically-plausible spatial deformations that are invariant under similarity transformations and with interactive bulging control. We demonstrate the efficiency and versatility of our method through a series of examples in 2D and 3D.","PeriodicalId":280036,"journal":{"name":"ACM SIGGRAPH 2023 Conference Proceedings","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-07-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"134011452","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A Hybrid Generator Architecture for Controllable Face Synthesis","authors":"Dann Mensah, N. Kim, M. Aittala, S. Laine, J. Lehtinen","doi":"10.1145/3588432.3591563","DOIUrl":"https://doi.org/10.1145/3588432.3591563","url":null,"abstract":"Modern data-driven image generation models often surpass traditional graphics techniques in quality. However, while traditional modeling and animation tools allow precise control over the image generation process in terms of interpretable quantities — e.g., shapes and reflectances — endowing learned models with such controls is generally difficult. In the context of human faces, we seek a data-driven generator architecture that simultaneously retains the photorealistic quality of modern generative adversarial networks (GAN) and allows explicit, disentangled controls over head shapes, expressions, identity, background, and illumination. While our high-level goal is shared by a large body of previous work, we approach the problem with a different philosophy: We treat the problem as an unconditional synthesis task, and engineer interpretable inductive biases into the model that make it easy for the desired behavior to emerge. Concretely, our generator is a combination of learned neural networks and fixed-function blocks, such as a 3D morphable head model and texture-mapping rasterizer, and we leave it up to the training process to figure out how they should be used together. This greatly simplifies the training problem by removing the need for labeled training data; we learn the distributions of the independent variables that drive the model instead of requiring that their values are known for each training image. Furthermore, we need no contrastive or imitation learning for correct behavior. We show that our design successfully encourages the generative model to make use of the internal, interpretable representations in a semantically meaningful manner. This allows sampling of different aspects of the image independently, as well as precise control of the results by manipulating the internal state of the interpretable blocks within the generator. This enables, for instance, facial animation using traditional animation tools.","PeriodicalId":280036,"journal":{"name":"ACM SIGGRAPH 2023 Conference Proceedings","volume":"29 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-07-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"126652524","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"DARAM: Dynamic Avatar-Human Motion Remapping Technique for Realistic Virtual Stair Ascending Motions","authors":"Soobin Lim, Seung-hun Seo, Hyeongyeop Kang","doi":"10.1145/3588432.3591527","DOIUrl":"https://doi.org/10.1145/3588432.3591527","url":null,"abstract":"This paper introduces DARAM, a dynamic avatar-human motion remapping technique that enables VR users to ascend virtual stairs. The primary design goal is to provide a realistic sensation of virtual stair walking while accounting for discrepancies between the user’s real body motion and the avatar’s motion, arising due to the virtual stairs present only in the virtual environment. Another design goal is to make DARAM applicable to dynamic multi-user environments. To this end, DARAM is designed to achieve motion remapping dynamically without requiring prior information about virtual stairs or environments, simplifying implementation in diverse VR applications. Furthermore, DARAM aims to synthesize avatar motion that delivers not only a realistic first-person experience but also a believable third-person experience for surrounding observers, making it applicable to multi-user VR applications. Two user studies demonstrate that the proposed technique successfully serves our design goals.","PeriodicalId":280036,"journal":{"name":"ACM SIGGRAPH 2023 Conference Proceedings","volume":"34 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-07-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127775806","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Kernel-Based Frame Interpolation for Spatio-Temporally Adaptive Rendering","authors":"Karlis Martins Briedis, Abdelaziz Djelouah, Raphael Ortiz, Mark Meyer, M. Gross, Christopher Schroers","doi":"10.1145/3588432.3591497","DOIUrl":"https://doi.org/10.1145/3588432.3591497","url":null,"abstract":"Recently, there has been exciting progress in frame interpolation for rendered content. In this offline rendering setting, additional inputs, such as albedo and depth, can be extracted from a scene at a very low cost and, when integrated in a suitable fashion, can significantly improve the quality of the interpolated frames. Although existing approaches have been able to show good results, most high-quality interpolation methods use a synthesis network for direct color prediction. In complex scenarios, this can result in unpredictable behavior and lead to color artifacts. To mitigate this and to increase robustness, we propose to estimate the interpolated frame by predicting spatially varying kernels that operate on image splats. Kernel prediction ensures a linear mapping from the input images to the output and enables new opportunities, such as consistent and efficient interpolation of alpha values or many other additional channels and render passes that might exist. Additionally, we present an adaptive strategy that allows predicting full or partial keyframes that should be rendered with color samples solely based on the auxiliary features of a shot. This content-based spatio-temporal adaptivity allows rendering significantly fewer color pixels as compared to a fixed-step scheme when wanting to maintain a certain quality. Overall, these contributions lead to a more robust method and significant further reductions of the rendering costs.","PeriodicalId":280036,"journal":{"name":"ACM SIGGRAPH 2023 Conference Proceedings","volume":"25 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-07-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125824108","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Denoising-Aware Adaptive Sampling for Monte Carlo Ray Tracing","authors":"A. Firmino, J. Frisvad, H. Jensen","doi":"10.1145/3588432.3591537","DOIUrl":"https://doi.org/10.1145/3588432.3591537","url":null,"abstract":"Monte Carlo rendering is a computationally intensive task, but combined with recent deep-learning based advances in image denoising it is possible to achieve high quality images in a shorter amount of time. We present a novel adaptive sampling technique that further improves the efficiency of Monte Carlo rendering combined with deep-learning based denoising. Our proposed technique is general, can be combined with existing pre-trained denoisers, and, in contrast with previous techniques, does not itself require any additional neural networks or learning. A key contribution of our work is a general method for estimating the variance of the outputs of a neural network whose inputs are random variables. Our method iteratively renders additional samples and uses this novel variance estimate to compute the sample distribution for each subsequent iteration. Compared to uniform sampling and previous adaptive sampling techniques, our method achieves better equal-time error in all scenes tested, and when combined with a recent denoising post-correction technique, significantly faster error convergence is realized.","PeriodicalId":280036,"journal":{"name":"ACM SIGGRAPH 2023 Conference Proceedings","volume":"17 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-07-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133673389","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"MesoGen: Designing Procedural On-Surface Stranded Mesostructures","authors":"Élie Michel, T. Boubekeur","doi":"10.1145/3588432.3591496","DOIUrl":"https://doi.org/10.1145/3588432.3591496","url":null,"abstract":"Three-dimensional mesostructures enrich coarse macrosurfaces with complex features, which are 3D geometry with arbitrary topology in essence, but are expected to be self-similar with no tiling artifacts, just like texture-based material models. This is a challenging task, as no existing modeling tool provides the right constraints in the design phase to ensure such properties while maintaining real-time editing capabilities. In this paper, we propose MesoGen, a novel tile-centric authoring approach for the design of procedural mesostructures featuring non-periodic self-similarity while being represented as a compact and GPU-friendly model. We ensure by construction the continuity of the mesostructure: the user designs a set of atomic tiles by drawing 2D cross-sections on the interfaces between tiles, and selecting pairs of cross-sections to be connected as strands, i.e., 3D sweep surfaces. In parallel, a tiling engine continuously fills the shell space of the macrosurface with the so-defined tile set while ensuring that only matching interfaces are in contact. Moreover, the engine suggests to the user the addition of new tiles whenever the problem happens to be over-constrained. As a result, our method allows for the rapid creation of complex, seamless procedural mesostructure and is particularly adapted for wicker-like ones, often impossible to achieve with scattering-based mesostructure synthesis methods.","PeriodicalId":280036,"journal":{"name":"ACM SIGGRAPH 2023 Conference Proceedings","volume":"86 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-07-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133154786","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"NeuSample: Importance Sampling for Neural Materials","authors":"Bing Xu, Liwen Wu, Miloš Hašan, Fujun Luan, Iliyan Georgiev, Zexiang Xu, R. Ramamoorthi","doi":"10.1145/3588432.3591524","DOIUrl":"https://doi.org/10.1145/3588432.3591524","url":null,"abstract":"Neural material representations have recently been proposed to augment the material appearance toolbox used in realistic rendering. These models are successful at tasks ranging from measured BTF compression, through efficient rendering of synthetic displaced materials with occlusions, to BSDF layering. However, importance sampling has been an after-thought in most neural material approaches, and has been handled by inefficient cosine-hemisphere sampling or mixing it with an additional simple analytic lobe. In this paper we fill that gap, by evaluating and comparing various pdf-learning approaches for sampling spatially varying neural materials, and proposing new variations of these approaches. We investigate three sampling approaches: analytic-lobe mixtures, normalizing flows, and histogram prediction. Within each type, we introduce improvements beyond previous work, and we extensively evaluate and compare these approaches in terms of sampling rate, wall-clock time, and final visual quality. Our versions of normalizing flows and histogram mixtures perform well and can be used in practical rendering systems, potentially facilitating the broader adoption of neural material models in production.","PeriodicalId":280036,"journal":{"name":"ACM SIGGRAPH 2023 Conference Proceedings","volume":"50 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-07-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"131642732","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"In the Blink of an Eye: Event-based Emotion Recognition","authors":"Haiwei Zhang, Jiqing Zhang, B. Dong, P. Peers, Wenwei Wu, Xiaopeng Wei, Felix Heide, Xin Yang","doi":"10.1145/3588432.3591511","DOIUrl":"https://doi.org/10.1145/3588432.3591511","url":null,"abstract":"We introduce a wearable single-eye emotion recognition device and a real-time approach to recognizing emotions from partial observations of an emotion that is robust to changes in lighting conditions. At the heart of our method is a bio-inspired event-based camera setup and a newly designed lightweight Spiking Eye Emotion Network (SEEN). Compared to conventional cameras, event-based cameras offer a higher dynamic range (up to 140 dB vs. 80 dB) and a higher temporal resolution (in the order of μ s vs. 10s of ms). Thus, the captured events can encode rich temporal cues under challenging lighting conditions. However, these events lack texture information, posing problems in decoding temporal information effectively. SEEN tackles this issue from two different perspectives. First, we adopt convolutional spiking layers to take advantage of the spiking neural network’s ability to decode pertinent temporal information. Second, SEEN learns to extract essential spatial cues from corresponding intensity frames and leverages a novel weight-copy scheme to convey spatial attention to the convolutional spiking layers during training and inference. We extensively validate and demonstrate the effectiveness of our approach on a specially collected Single-eye Event-based Emotion (SEE) dataset. To the best of our knowledge, our method is the first eye-based emotion recognition method that leverages event-based cameras and spiking neural networks.","PeriodicalId":280036,"journal":{"name":"ACM SIGGRAPH 2023 Conference Proceedings","volume":"33 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-07-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"122688425","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Neural Parametric Mixtures for Path Guiding","authors":"Honghao Dong, Guoping Wang, Sheng Li","doi":"10.1145/3588432.3591533","DOIUrl":"https://doi.org/10.1145/3588432.3591533","url":null,"abstract":"Previous path guiding techniques typically rely on spatial subdivision structures to approximate directional target distributions, which may cause failure to capture spatio-directional correlations and introduce parallax issue. In this paper, we present Neural Parametric Mixtures (NPM), a neural formulation to encode target distributions for path guiding algorithms. We propose to use a continuous and compact neural implicit representation for encoding parametric models while decoding them via lightweight neural networks. We then derive a gradient-based optimization strategy to directly train the parameters of NPM with noisy Monte Carlo radiance estimates. Our approach efficiently models the target distribution (incident radiance or the product integrand) for path guiding, and outperforms previous guiding methods by capturing the spatio-directional correlations more accurately. Moreover, our approach is more training efficient and is practical for parallelization on modern GPUs.","PeriodicalId":280036,"journal":{"name":"ACM SIGGRAPH 2023 Conference Proceedings","volume":"76 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-07-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"123594044","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}