An Implicit Neural Representation for the Image Stack: Depth, All in Focus, and High Dynamic Range
Chao Wang, Ana Serrano, Xingang Pan, Krzysztof Wolski, Bin Chen, K. Myszkowski, Hans-Peter Seidel, Christian Theobalt, Thomas Leimkühler
ACM Transactions on Graphics (TOG), pp. 1-11, published 2023-12-04. DOI: https://doi.org/10.1145/3618367
Abstract: In everyday photography, physical limitations of camera sensors and lenses frequently lead to a variety of degradations in captured images, such as saturation or defocus blur. A common approach to overcoming these limitations is image stack fusion, which involves capturing multiple images with different focal distances or exposures. For instance, to obtain an all-in-focus image, a set of multi-focus images is captured; similarly, capturing multiple exposures allows for the reconstruction of high dynamic range. In this paper, we present a novel approach that combines neural fields with an expressive camera model to achieve a unified reconstruction of an all-in-focus, high-dynamic-range image from an image stack. Our approach is composed of a set of specialized implicit neural representations, each tailored to a specific sub-problem along our pipeline: we use neural implicits to predict flow (to overcome misalignments arising from lens breathing), depth and all-in-focus images (to account for depth of field), and tonemapping (to deal with sensor response and saturation), all trained using a physically inspired supervision structure with a differentiable thin lens model at its core. An important benefit of our approach is its ability to handle these tasks simultaneously or independently, providing flexible post-editing capabilities such as refocusing and exposure adjustment. By sampling the three primary factors in photography within our framework (focal distance, aperture, and exposure time), we conduct a thorough exploration to gain insights into their significance and impact on overall reconstruction quality. Through extensive validation, we demonstrate that our method outperforms existing approaches in both depth-from-defocus and all-in-focus image reconstruction. Moreover, our approach exhibits promising results in each of these three dimensions, showcasing its potential to enhance captured image quality and provide greater control in post-processing.

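The supervision above is built around a differentiable thin lens model. As a minimal sketch of the generic textbook formula behind such a model (the parameter names are illustrative, not the paper's), the diameter of the circle of confusion for an out-of-focus scene point can be computed as:

```python
def circle_of_confusion(depth, focus_dist, focal_len, aperture):
    """Diameter of the defocus blur circle for a thin lens, in the same
    units as focal_len.

    Standard thin-lens relation:
        c = aperture * focal_len * |depth - focus_dist|
            / (depth * (focus_dist - focal_len))

    depth:      distance from lens to the scene point
    focus_dist: distance at which the lens is focused
    focal_len:  focal length of the lens
    aperture:   aperture (entrance pupil) diameter
    """
    return (aperture * focal_len * abs(depth - focus_dist)
            / (depth * (focus_dist - focal_len)))

# A point on the focal plane is perfectly sharp; blur grows off-plane.
sharp = circle_of_confusion(2000.0, 2000.0, 50.0, 25.0)   # 0.0
near_blur = circle_of_confusion(1000.0, 2000.0, 50.0, 25.0)
far_blur = circle_of_confusion(4000.0, 2000.0, 50.0, 25.0)
```

Because every operation here is a smooth function of the camera parameters (away from the focal plane), the same expression is differentiable and can sit inside a gradient-based training loop, which is presumably what makes this style of camera model attractive for neural-field supervision.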
Computational Design of LEGO® Sketch Art
Mingjun Zhou, Jiahao Ge, Hao Xu, Chi-Wing Fu
ACM Transactions on Graphics (TOG), pp. 1-15, published 2023-12-04. DOI: https://doi.org/10.1145/3618306
Abstract: This paper presents computational methods to aid the creation of LEGO® sketch models from simple input images. Beyond conventional LEGO® mosaics, we aim to improve the expressiveness of LEGO® models by utilizing LEGO® tiles with sloping and rounded edges, together with rectangular bricks, to reproduce smooth curves and sharp features in the input. This is a challenging task, as we have limited brick shapes to use and limited space to place bricks; moreover, the search space is immense and combinatorial in nature. We approach the task by decoupling the LEGO® construction into two steps: first approximating the shape with a LEGO®-buildable contour, then filling the contour polygon with LEGO® bricks. We formulate the contour approximation as a graph optimization with our objective and constraints and effectively solve for the contour polygon that best approximates the input shape. Further, we extend our optimization model to handle multi-color and multi-layer regions, and formulate a grid-alignment process and various perceptual constraints to refine the results. We employ our method to create a large variety of LEGO® models and compare it against human designers and baseline methods to demonstrate its compelling quality and speed.

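The paper's contour step solves a graph optimization; as a much cruder stand-in that illustrates only the buildability constraint, one could greedily snap each contour edge to the nearest direction realizable by the brick library (the 45-degree angle set below is my assumption, not the paper's actual tile catalog):

```python
import math

# Hypothetical set of edge directions realizable with rectangular bricks
# and 45-degree slope tiles; the paper's actual brick library is richer.
ALLOWED_ANGLES = [math.radians(a) for a in range(0, 360, 45)]

def angle_diff(a, b):
    """Smallest signed difference between two angles, in (-pi, pi]."""
    return math.atan2(math.sin(a - b), math.cos(a - b))

def snap_contour(points):
    """Greedy sketch: re-trace a polyline so that every edge direction is
    snapped to the nearest buildable angle while keeping edge lengths.
    Unlike the paper's global optimization, this can accumulate drift."""
    snapped = [points[0]]
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        length = math.hypot(x1 - x0, y1 - y0)
        theta = math.atan2(y1 - y0, x1 - x0)
        best = min(ALLOWED_ANGLES, key=lambda a: abs(angle_diff(theta, a)))
        px, py = snapped[-1]
        snapped.append((px + length * math.cos(best),
                        py + length * math.sin(best)))
    return snapped

# A nearly horizontal edge snaps to exactly horizontal.
result = snap_contour([(0.0, 0.0), (1.0, 0.1)])
```

The greedy version makes the global-optimization framing in the abstract concrete: snapping edges independently minimizes per-edge error but ignores closure and overlap constraints, which is precisely why a graph formulation over the whole contour is needed.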
High Density Ratio Multi-Fluid Simulation with Peridynamics
Han Yan, Bo-Ning Ren
ACM Transactions on Graphics (TOG), pp. 1-14, published 2023-12-04. DOI: https://doi.org/10.1145/3618347
Abstract: Multi-fluid simulation has attracted wide research interest in recent years. Despite the impressive successes of current works, simulating scenes that contain mixing or unmixing of high-density-ratio phases with particle-based discretizations remains a challenging task. In this paper, we propose a peridynamic mixture-model theory that stably handles high-density-ratio multi-fluid simulations. With the assistance of novel scalar-valued volume flow states, a particle-based discretization scheme is proposed to evaluate all terms of the multi-phase Navier-Stokes equations in integral form. We also design a novel mass-updating strategy that enhances phase mass conservation and reduces particle volume variations under high-density-ratio settings. As a result, we achieve significantly more stable mixture-model multi-fluid simulations involving mixing and unmixing of high-density-ratio phases. Various experiments and comparisons demonstrate the effectiveness of our approach.

Efficient Hybrid Zoom Using Camera Fusion on Mobile Phones
Xiaotong Wu, Wei-Sheng Lai, Yi-Chang Shih, Charles Herrmann, Michael Krainin, Deqing Sun, Chia-Kai Liang
ACM Transactions on Graphics (TOG), pp. 1-12, published 2023-12-04. DOI: https://doi.org/10.1145/3618362
Abstract: DSLR cameras can achieve multiple zoom levels by shifting lens distances or swapping lens types. These techniques are not possible on smartphones due to space constraints. Most smartphone manufacturers instead adopt a hybrid zoom system: commonly a Wide (W) camera at a low zoom level and a Telephoto (T) camera at a high zoom level. To simulate zoom levels between W and T, these systems crop and digitally upsample images from W, leading to significant detail loss. In this paper, we propose an efficient system for hybrid zoom super-resolution on mobile devices, which captures a synchronous pair of W and T shots and leverages machine learning models to align and transfer details from T to W. We further develop an adaptive blending method that accounts for depth-of-field mismatches, scene occlusion, flow uncertainty, and alignment errors. To minimize the domain gap, we design a dual-phone camera rig to capture real-world inputs and ground truths for supervised training. Our method generates a 12-megapixel image in 500 ms on a mobile platform and compares favorably against state-of-the-art methods under extensive evaluation on real-world scenarios.

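At its core, adaptive blending of this kind is a per-pixel alpha blend between the wide frame and the warped telephoto frame, steered by a confidence map. A minimal sketch (how the confidence map is computed from occlusion, flow uncertainty, and alignment error is the paper's contribution and is not reproduced here):

```python
def blend(wide, tele_aligned, confidence):
    """Per-pixel alpha blend of two equally sized grayscale images,
    given as 2D lists of values in [0, 1].

    Where `confidence` in the warped telephoto detail is high, the output
    follows tele_aligned; where alignment is unreliable (occlusions,
    flow errors), it falls back to the wide frame.
    """
    return [[c * t + (1.0 - c) * w
             for w, t, c in zip(wrow, trow, crow)]
            for wrow, trow, crow in zip(wide, tele_aligned, confidence)]

# Confidence 0.25 keeps 75% of the wide pixel, 25% of the telephoto pixel.
out = blend([[0.0]], [[1.0]], [[0.25]])  # [[0.25]]
```

The design choice worth noting is that a continuous blend degrades gracefully: a wrong confidence value produces a slightly softer pixel rather than a visible seam, which matters when the two cameras disagree near depth discontinuities.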
ToRoS: A Topology Optimization Approach for Designing Robotic Skins
Juan Montes Maestre, R. Hinchet, Stelian Coros, Bernhard Thomaszewski
ACM Transactions on Graphics (TOG), pp. 1-11, published 2023-12-04. DOI: https://doi.org/10.1145/3618382
Abstract: Soft robotics offers unique advantages in manipulating fragile or deformable objects, in human-robot interaction, and in exploring inaccessible terrain. However, designing soft robots that produce large, targeted deformations is challenging. In this paper, we propose a new methodology for designing soft robots that combines optimization-based design with a simple and cost-efficient manufacturing process. Our approach is centered on the concept of robotic skins: thin fabrics with 3D-printed reinforcement patterns that augment and control plain silicone actuators. By decoupling shape control from actuation, our approach enables a simpler and more cost-efficient manufacturing process. Unlike previous methods that rely on empirical design heuristics to generate desired deformations, our approach automatically discovers complex reinforcement patterns without any need for domain knowledge or human intervention. This is achieved by casting reinforcement design as a nonlinear constrained optimization problem and using a novel three-field topology optimization approach tailored to fabrics with 3D-printed reinforcements. We demonstrate the potential of our approach by designing soft robotic actuators capable of various motions such as bending, contraction, twisting, and combinations thereof. We also demonstrate applications of our robotic skins in robotic grasping with a soft three-finger gripper and in locomotion tasks for a soft quadrupedal robot.

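For readers unfamiliar with topology optimization, the classic density-based (SIMP) interpolation illustrates the basic mechanism such methods rely on: intermediate material densities are made structurally inefficient so the optimizer is driven toward crisp material/void patterns. This is generic background, not the paper's three-field formulation:

```python
def simp_stiffness(rho, e_solid=1.0, e_void=1e-9, p=3.0):
    """SIMP (Solid Isotropic Material with Penalization) interpolation.

    rho:     design density in [0, 1] for one element
    e_solid: Young's modulus of full material
    e_void:  tiny stiffness for void (avoids a singular system)
    p:       penalization power; p > 1 makes intermediate densities
             pay nearly full mass for much less than full stiffness,
             pushing optimized designs toward discrete 0/1 patterns.
    """
    return e_void + (rho ** p) * (e_solid - e_void)

# Half density buys only one eighth of the stiffness when p = 3.
half = simp_stiffness(0.5)  # ~0.125
```

A "three-field" scheme, as named in the abstract, layers additional design/filtered/projected fields on top of such a density parameterization; the penalization idea above is the common starting point.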
Rectifying Strip Patterns
Bolun Wang, Hui Wang, E. Schling, Helmut Pottmann
ACM Transactions on Graphics (TOG), pp. 1-18, published 2023-12-04. DOI: https://doi.org/10.1145/3618378
Abstract: Straight flat strips of inextensible material can be bent into curved strips aligned with arbitrary space curves. The large shape variety of these so-called rectifying strips makes them candidates for shape modeling, especially in applications such as architecture where simple elements are preferred for the fabrication of complex shapes. In this paper, we provide computational tools for the design of shapes from rectifying strips. They can form various patterns and fulfill constraints required for specific applications such as gridshells or shading systems. The methodology is based on discrete models of rectifying strips, a discrete level-set formulation, and optimization-based constrained mesh design and editing. We also analyse the geometry at nodes and present remarkable quadrilateral arrangements of rectifying strips with torsion-free nodes.

SAILOR: Synergizing Radiance and Occupancy Fields for Live Human Performance Capture
Zheng Dong, Ke Xu, Yaoan Gao, Qilin Sun, Hujun Bao, Weiwei Xu, Rynson W. H. Lau
ACM Transactions on Graphics (TOG), pp. 1-15, published 2023-12-04. DOI: https://doi.org/10.1145/3618370
Abstract: Immersive user experiences in live VR/AR performances require fast and accurate free-view rendering of the performers. Existing methods are mainly based on Pixel-aligned Implicit Functions (PIFu) or Neural Radiance Fields (NeRF). However, while PIFu-based methods usually fail to produce photorealistic view-dependent textures, NeRF-based methods typically lack local geometry accuracy and are computationally heavy (e.g., dense sampling of 3D points, additional fine-tuning, or pose estimation). In this work, we propose a novel generalizable method, named SAILOR, to create high-quality human free-view videos from very sparse RGBD live streams. To produce view-dependent textures while preserving locally accurate geometry, we integrate PIFu and NeRF such that they work synergistically, by conditioning the PIFu on depth and then rendering view-dependent textures through NeRF. Specifically, we propose a novel network, named SRONet, for this hybrid representation. SRONet can handle unseen performers without fine-tuning. Besides, a neural blending-based ray interpolation approach, a tree-based voxel-denoising scheme, and a parallel computing pipeline are incorporated to reconstruct and render live free-view videos at 10 fps on average. To evaluate the rendering performance, we construct a real-captured RGBD benchmark from 40 performers. Experimental results show that SAILOR outperforms existing human reconstruction and performance capture methods.

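The NeRF side of such a hybrid renders color by compositing samples along each camera ray. The standard front-to-back alpha compositing used by NeRF-style renderers (generic volume-rendering math, not SAILOR's specific network) looks like:

```python
def composite(alphas, colors):
    """Front-to-back volume compositing along one ray.

    alphas[i]: opacity of sample i in [0, 1], ordered near to far
    colors[i]: grayscale color of sample i
    Returns the accumulated ray color. The transmittance of sample i is
    T_i = prod_{j < i} (1 - alphas[j]): light surviving all earlier samples.
    """
    color, transmittance = 0.0, 1.0
    for a, c in zip(alphas, colors):
        color += transmittance * a * c
        transmittance *= (1.0 - a)
    return color

# A fully opaque first sample hides everything behind it.
front = composite([1.0, 0.8], [0.7, 0.2])  # 0.7
```

An accurate occupancy/geometry estimate (the PIFu side) concentrates the opaque samples near the true surface, which is why fewer samples per ray suffice; that is the computational synergy the abstract describes.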
Reconstructing Close Human Interactions from Multiple Views
Qing Shuai, Zhiyuan Yu, Zhize Zhou, Lixin Fan, Haijun Yang, Can Yang, Xiaowei Zhou
ACM Transactions on Graphics (TOG), pp. 1-14, published 2023-12-04. DOI: https://doi.org/10.1145/3618336
Abstract: This paper addresses the challenging task of reconstructing the poses of multiple individuals engaged in close interactions, captured by multiple calibrated cameras. The difficulty arises from noisy or false 2D keypoint detections due to inter-person occlusion, heavy ambiguity in associating keypoints to individuals due to the close interactions, and the scarcity of training data, as collecting and annotating motion data in crowded scenes is resource-intensive. We introduce a novel system to address these challenges. Our system integrates a learning-based pose estimation component and its corresponding training and inference strategies. The pose estimation component takes multi-view 2D keypoint heatmaps as input and reconstructs the pose of each individual using a 3D conditional volumetric network. Because the network does not take images as input, we can leverage known camera parameters from test scenes and a large quantity of existing motion capture data to synthesize massive training data that mimics the real data distribution in test scenes. Extensive experiments demonstrate that our approach significantly surpasses previous approaches in terms of pose accuracy and is generalizable across various camera setups and population sizes. The code is available on our project page: https://github.com/zju3dv/CloseMoCap.

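A common building block when working with the 2D keypoint heatmaps mentioned above is a soft-argmax: a differentiable estimate of the peak location. This is generic background for heatmap-based pose pipelines, not necessarily the exact operator used in the paper:

```python
import math

def soft_argmax(heatmap, beta=10.0):
    """Softmax-weighted expected (row, col) position over a 2D heatmap,
    given as a list of lists of scores. Unlike a hard argmax, the result
    is differentiable in the heatmap values; larger beta sharpens the
    softmax toward the hard argmax."""
    # Softmax over all cells; subtract the max for numerical stability.
    m = max(max(row) for row in heatmap)
    weights = [[math.exp(beta * (v - m)) for v in row] for row in heatmap]
    total = sum(sum(row) for row in weights)
    r = sum(i * w for i, row in enumerate(weights) for w in row) / total
    c = sum(j * w for row in weights for j, w in enumerate(row)) / total
    return r, c

# A heatmap with a single strong peak at row 1, col 2.
hm = [[0, 0, 0, 0],
      [0, 0, 5, 0],
      [0, 0, 0, 0]]
row, col = soft_argmax(hm)  # ~ (1.0, 2.0)
```

In a multi-view volumetric network, heatmaps like these are typically unprojected into a shared 3D volume using the known camera parameters rather than reduced to 2D coordinates per view, but the soft-argmax conveys why heatmaps are a convenient differentiable intermediate.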
C-Shells: Deployable Gridshells with Curved Beams
Quentin Becker, Seiichi Suzuki, Y. Ren, Davide Pellis, Julian Panetta, Mark Pauly
ACM Transactions on Graphics (TOG), pp. 1-17, published 2023-12-04. DOI: https://doi.org/10.1145/3618366
Abstract: We introduce a computational pipeline for simulating and designing C-shells, a new class of planar-to-spatial deployable linkage structures. A C-shell is composed of curved flexible beams connected at rotational joints that can be assembled in a stress-free planar configuration. When actuated, the elastic beams deform and the assembly deploys towards the target 3D shape. We propose two alternative computational design approaches for C-shells: (i) Forward exploration simulates the deployed shape from a planar beam layout provided by the user. Once a satisfactory overall shape is found, a subsequent design optimization adapts the beam geometry to reduce the elastic energy of the linkage while preserving the target shape. (ii) Inverse design is facilitated by a new geometric flattening method that takes a design surface as input and computes an initial layout of piecewise straight linkage beams. Our design optimization algorithm then calculates the smooth curved beams that best reproduce the target shape at minimal elastic energy. We find that C-shells offer a rich space for design and show several studies that highlight new shape topologies that cannot be achieved with existing deployable linkage structures.

NodeGit: Diffing and Merging Node Graphs
Eduardo Rinaldi, D. Sforza, Fabio Pellacini
ACM Transactions on Graphics (TOG), pp. 1-12, published 2023-12-04. DOI: https://doi.org/10.1145/3618343
Abstract: The use of version control is pervasive in collaborative software projects. Version control systems are based on two primary operations: diffing two versions to compute the change between them, and merging two versions edited concurrently. Recent works provide solutions to diff and merge graphics assets such as images, meshes, and scenes. In this work, we present a practical algorithm to diff and merge procedural programs written as node graphs. To obtain more precise diffs, we version the graphs directly rather than their textual representations. Diffing graphs is equivalent to computing the graph edit distance, which is known to be computationally infeasible. Following prior work, we propose an approximate algorithm tailored to our problem domain. We validate the proposed algorithm by applying it both to manual edits and to a large set of randomized modifications of procedural shapes and materials. We compared our method with existing state-of-the-art algorithms, showing that our approach is the only one that reliably detects user edits.

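To see why an approximation is needed at all: exact graph edit distance is NP-hard, but if nodes carry stable identifiers, a far cheaper diff falls out of simple set operations. The sketch below is a drastic simplification of the paper's algorithm (it ignores edges entirely and treats a renamed node as a removal plus an addition):

```python
def diff_graphs(old, new):
    """Diff two node graphs given as {node_id: params_dict}.

    Matching nodes by id sidesteps the (NP-hard) graph edit distance;
    the price is that this naive version cannot recognize renames and
    does not compare edge connectivity at all.
    """
    added = sorted(set(new) - set(old))
    removed = sorted(set(old) - set(new))
    changed = sorted(k for k in set(old) & set(new) if old[k] != new[k])
    return {"added": added, "removed": removed, "changed": changed}

# One node edited, one removed, one added.
d = diff_graphs({"noise": {"scale": 1}, "mix": {}},
                {"noise": {"scale": 2}, "blur": {}})
```

The gap between this sketch and a useful tool, recognizing moved and renamed nodes and structural rewires, is exactly where a tailored approximate edit-distance algorithm earns its keep.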