{"title":"Sampling for View Synthesis: From Local Light Field Fusion to Neural Radiance Fields and Beyond","authors":"Ravi Ramamoorthi","doi":"arxiv-2408.04586","DOIUrl":"https://doi.org/arxiv-2408.04586","url":null,"abstract":"Capturing and rendering novel views of complex real-world scenes is a\u0000long-standing problem in computer graphics and vision, with applications in\u0000augmented and virtual reality, immersive experiences and 3D photography. The\u0000advent of deep learning has enabled revolutionary advances in this area,\u0000classically known as image-based rendering. However, previous approaches\u0000require intractably dense view sampling or provide little or no guidance for\u0000how users should sample views of a scene to reliably render high-quality novel\u0000views. Local light field fusion proposes an algorithm for practical view\u0000synthesis from an irregular grid of sampled views that first expands each\u0000sampled view into a local light field via a multiplane image scene\u0000representation, then renders novel views by blending adjacent local light\u0000fields. Crucially, we extend traditional plenoptic sampling theory to derive a\u0000bound that specifies precisely how densely users should sample views of a given\u0000scene when using our algorithm. We achieve the perceptual quality of Nyquist\u0000rate view sampling while using up to 4000x fewer views. Subsequent developments\u0000have led to new scene representations for deep learning with view synthesis,\u0000notably neural radiance fields, but the problem of sparse view synthesis from a\u0000small number of images has only grown in importance. We reprise some of the\u0000recent results on sparse and even single image view synthesis, while posing the\u0000question of whether prescriptive sampling guidelines are feasible for the new\u0000generation of image-based rendering algorithms.","PeriodicalId":501174,"journal":{"name":"arXiv - CS - Graphics","volume":"8 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-08-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141932559","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Sketch2Scene: Automatic Generation of Interactive 3D Game Scenes from User's Casual Sketches","authors":"Yongzhi Xu, Yonhon Ng, Yifu Wang, Inkyu Sa, Yunfei Duan, Yang Li, Pan Ji, Hongdong Li","doi":"arxiv-2408.04567","DOIUrl":"https://doi.org/arxiv-2408.04567","url":null,"abstract":"3D Content Generation is at the heart of many computer graphics applications,\u0000including video gaming, film-making, virtual and augmented reality, etc. This\u0000paper proposes a novel deep-learning based approach for automatically\u0000generating interactive and playable 3D game scenes, all from the user's casual\u0000prompts such as a hand-drawn sketch. Sketch-based input offers a natural, and\u0000convenient way to convey the user's design intention in the content creation\u0000process. To circumvent the data-deficient challenge in learning (i.e. the lack\u0000of large training data of 3D scenes), our method leverages a pre-trained 2D\u0000denoising diffusion model to generate a 2D image of the scene as the conceptual\u0000guidance. In this process, we adopt the isometric projection mode to factor out\u0000unknown camera poses while obtaining the scene layout. From the generated\u0000isometric image, we use a pre-trained image understanding method to segment the\u0000image into meaningful parts, such as off-ground objects, trees, and buildings,\u0000and extract the 2D scene layout. These segments and layouts are subsequently\u0000fed into a procedural content generation (PCG) engine, such as a 3D video game\u0000engine like Unity or Unreal, to create the 3D scene. The resulting 3D scene can\u0000be seamlessly integrated into a game development environment and is readily\u0000playable. Extensive tests demonstrate that our method can efficiently generate\u0000high-quality and interactive 3D game scenes with layouts that closely follow\u0000the user's intention.","PeriodicalId":501174,"journal":{"name":"arXiv - CS - Graphics","volume":"2 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-08-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141932560","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"One-Shot Method for Computing Generalized Winding Numbers","authors":"Cedric Martens, Mikhail Bessmeltsev","doi":"arxiv-2408.04466","DOIUrl":"https://doi.org/arxiv-2408.04466","url":null,"abstract":"The generalized winding number is an essential part of the geometry\u0000processing toolkit, allowing to quantify how much a given point is inside a\u0000surface, often represented by a mesh or a point cloud, even when the surface is\u0000open, noisy, or non-manifold. Parameterized surfaces, which often contain\u0000intentional and unintentional gaps and imprecisions, would also benefit from a\u0000generalized winding number. Standard methods to compute it, however, rely on a\u0000surface integral, challenging to compute without surface discretization,\u0000leading to loss of precision characteristic of parametric surfaces. We propose an alternative method to compute a generalized winding number,\u0000based only on the surface boundary and the intersections of a single ray with\u0000the surface. For parametric surfaces, we show that all the necessary operations\u0000can be done via a Sum-of-Squares (SOS) formulation, thus computing generalized\u0000winding numbers without surface discretization with machine precision. We show\u0000that by discretizing only the boundary of the surface, this becomes an\u0000efficient method. We demonstrate an application of our method to the problem of computing a\u0000generalized winding number of a surface represented by a curve network, where\u0000each curve loop is surfaced via Laplace equation. We use the Boundary Element\u0000Method to express the solution as a parametric surface, allowing us to apply\u0000our method without meshing the surfaces. As a bonus, we also demonstrate that\u0000for meshes with many triangles and a simple boundary, our method is faster than\u0000the hierarchical evaluation of the generalized winding number while still being\u0000precise. We validate our algorithms theoretically, numerically, and by demonstrating a\u0000gallery of results new{on a variety of parametric surfaces and meshes}, as\u0000well uses in a variety of applications, including voxelizations and boolean\u0000operations.","PeriodicalId":501174,"journal":{"name":"arXiv - CS - Graphics","volume":"14 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-08-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141968690","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Automatic Skinning using the Mixed Finite Element Method","authors":"Hongcheng Song, Dmitry Kachkovski, Shaimaa Monem, Abraham Kassauhun Negash, David I. W. Levin","doi":"arxiv-2408.04066","DOIUrl":"https://doi.org/arxiv-2408.04066","url":null,"abstract":"In this work, we show that exploiting additional variables in a mixed finite\u0000element formulation of deformation leads to an efficient physics-based\u0000character skinning algorithm. Taking as input, a user-defined rig, we show how\u0000to efficiently compute deformations of the character mesh which respect\u0000artist-supplied handle positions and orientations, but without requiring\u0000complicated constraints on the physics solver, which can cause poor\u0000performance. Rather we demonstrate an efficient, user controllable skinning\u0000pipeline that can generate compelling character deformations, using a variety\u0000of physics material models.","PeriodicalId":501174,"journal":{"name":"arXiv - CS - Graphics","volume":"30 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-08-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141968746","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Fast Sprite Decomposition from Animated Graphics","authors":"Tomoyuki Suzuki, Kotaro Kikuchi, Kota Yamaguchi","doi":"arxiv-2408.03923","DOIUrl":"https://doi.org/arxiv-2408.03923","url":null,"abstract":"This paper presents an approach to decomposing animated graphics into\u0000sprites, a set of basic elements or layers. Our approach builds on the\u0000optimization of sprite parameters to fit the raster video. For efficiency, we\u0000assume static textures for sprites to reduce the search space while preventing\u0000artifacts using a texture prior model. To further speed up the optimization, we\u0000introduce the initialization of the sprite parameters utilizing a pre-trained\u0000video object segmentation model and user input of single frame annotations. For\u0000our study, we construct the Crello Animation dataset from an online design\u0000service and define quantitative metrics to measure the quality of the extracted\u0000sprites. Experiments show that our method significantly outperforms baselines\u0000for similar decomposition tasks in terms of the quality/efficiency tradeoff.","PeriodicalId":501174,"journal":{"name":"arXiv - CS - Graphics","volume":"39 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-08-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141932613","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"RayGauss: Volumetric Gaussian-Based Ray Casting for Photorealistic Novel View Synthesis","authors":"Hugo Blanc, Jean-Emmanuel Deschaud, Alexis Paljic","doi":"arxiv-2408.03356","DOIUrl":"https://doi.org/arxiv-2408.03356","url":null,"abstract":"Differentiable volumetric rendering-based methods made significant progress\u0000in novel view synthesis. On one hand, innovative methods have replaced the\u0000Neural Radiance Fields (NeRF) network with locally parameterized structures,\u0000enabling high-quality renderings in a reasonable time. On the other hand,\u0000approaches have used differentiable splatting instead of NeRF's ray casting to\u0000optimize radiance fields rapidly using Gaussian kernels, allowing for fine\u0000adaptation to the scene. However, differentiable ray casting of irregularly\u0000spaced kernels has been scarcely explored, while splatting, despite enabling\u0000fast rendering times, is susceptible to clearly visible artifacts. Our work closes this gap by providing a physically consistent formulation of\u0000the emitted radiance c and density {sigma}, decomposed with Gaussian functions\u0000associated with Spherical Gaussians/Harmonics for all-frequency colorimetric\u0000representation. We also introduce a method enabling differentiable ray casting\u0000of irregularly distributed Gaussians using an algorithm that integrates\u0000radiance fields slab by slab and leverages a BVH structure. This allows our\u0000approach to finely adapt to the scene while avoiding splatting artifacts. As a\u0000result, we achieve superior rendering quality compared to the state-of-the-art\u0000while maintaining reasonable training times and achieving inference speeds of\u000025 FPS on the Blender dataset. Project page with videos and code:\u0000https://raygauss.github.io/","PeriodicalId":501174,"journal":{"name":"arXiv - CS - Graphics","volume":"41 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-08-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141932611","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"An Object is Worth 64x64 Pixels: Generating 3D Object via Image Diffusion","authors":"Xingguang Yan, Han-Hung Lee, Ziyu Wan, Angel X. Chang","doi":"arxiv-2408.03178","DOIUrl":"https://doi.org/arxiv-2408.03178","url":null,"abstract":"We introduce a new approach for generating realistic 3D models with UV maps\u0000through a representation termed \"Object Images.\" This approach encapsulates\u0000surface geometry, appearance, and patch structures within a 64x64 pixel image,\u0000effectively converting complex 3D shapes into a more manageable 2D format. By\u0000doing so, we address the challenges of both geometric and semantic irregularity\u0000inherent in polygonal meshes. This method allows us to use image generation\u0000models, such as Diffusion Transformers, directly for 3D shape generation.\u0000Evaluated on the ABO dataset, our generated shapes with patch structures\u0000achieve point cloud FID comparable to recent 3D generative models, while\u0000naturally supporting PBR material generation.","PeriodicalId":501174,"journal":{"name":"arXiv - CS - Graphics","volume":"77 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-08-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141932612","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"MGFs: Masked Gaussian Fields for Meshing Building based on Multi-View Images","authors":"Tengfei Wang, Zongqian Zhan, Rui Xia, Linxia Ji, Xin Wang","doi":"arxiv-2408.03060","DOIUrl":"https://doi.org/arxiv-2408.03060","url":null,"abstract":"Over the last few decades, image-based building surface reconstruction has\u0000garnered substantial research interest and has been applied across various\u0000fields, such as heritage preservation, architectural planning, etc. Compared to\u0000the traditional photogrammetric and NeRF-based solutions, recently, Gaussian\u0000fields-based methods have exhibited significant potential in generating surface\u0000meshes due to their time-efficient training and detailed 3D information\u0000preservation. However, most gaussian fields-based methods are trained with all\u0000image pixels, encompassing building and nonbuilding areas, which results in a\u0000significant noise for building meshes and degeneration in time efficiency. This\u0000paper proposes a novel framework, Masked Gaussian Fields (MGFs), designed to\u0000generate accurate surface reconstruction for building in a time-efficient way.\u0000The framework first applies EfficientSAM and COLMAP to generate multi-level\u0000masks of building and the corresponding masked point clouds. Subsequently, the\u0000masked gaussian fields are trained by integrating two innovative losses: a\u0000multi-level perceptual masked loss focused on constructing building regions and\u0000a boundary loss aimed at enhancing the details of the boundaries between\u0000different masks. Finally, we improve the tetrahedral surface mesh extraction\u0000method based on the masked gaussian spheres. Comprehensive experiments on UAV\u0000images demonstrate that, compared to the traditional method and several\u0000NeRF-based and Gaussian-based SOTA solutions, our approach significantly\u0000improves both the accuracy and efficiency of building surface reconstruction.\u0000Notably, as a byproduct, there is an additional gain in the novel view\u0000synthesis of building.","PeriodicalId":501174,"journal":{"name":"arXiv - CS - Graphics","volume":"85 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-08-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141932607","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Geometric Algebra Meets Large Language Models: Instruction-Based Transformations of Separate Meshes in 3D, Interactive and Controllable Scenes","authors":"Dimitris Angelis, Prodromos Kolyvakis, Manos Kamarianakis, George Papagiannakis","doi":"arxiv-2408.02275","DOIUrl":"https://doi.org/arxiv-2408.02275","url":null,"abstract":"This paper introduces a novel integration of Large Language Models (LLMs)\u0000with Conformal Geometric Algebra (CGA) to revolutionize controllable 3D scene\u0000editing, particularly for object repositioning tasks, which traditionally\u0000requires intricate manual processes and specialized expertise. These\u0000conventional methods typically suffer from reliance on large training datasets\u0000or lack a formalized language for precise edits. Utilizing CGA as a robust\u0000formal language, our system, shenlong, precisely models spatial transformations\u0000necessary for accurate object repositioning. Leveraging the zero-shot learning\u0000capabilities of pre-trained LLMs, shenlong translates natural language\u0000instructions into CGA operations which are then applied to the scene,\u0000facilitating exact spatial transformations within 3D scenes without the need\u0000for specialized pre-training. Implemented in a realistic simulation\u0000environment, shenlong ensures compatibility with existing graphics pipelines.\u0000To accurately assess the impact of CGA, we benchmark against robust Euclidean\u0000Space baselines, evaluating both latency and accuracy. Comparative performance\u0000evaluations indicate that shenlong significantly reduces LLM response times by\u000016% and boosts success rates by 9.6% on average compared to the traditional\u0000methods. Notably, shenlong achieves a 100% perfect success rate in common\u0000practical queries, a benchmark where other systems fall short. These\u0000advancements underscore shenlong's potential to democratize 3D scene editing,\u0000enhancing accessibility and fostering innovation across sectors such as\u0000education, digital entertainment, and virtual reality.","PeriodicalId":501174,"journal":{"name":"arXiv - CS - Graphics","volume":"100 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-08-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141932608","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"SceneMotifCoder: Example-driven Visual Program Learning for Generating 3D Object Arrangements","authors":"Hou In Ivan Tam, Hou In Derek Pun, Austin T. Wang, Angel X. Chang, Manolis Savva","doi":"arxiv-2408.02211","DOIUrl":"https://doi.org/arxiv-2408.02211","url":null,"abstract":"Despite advances in text-to-3D generation methods, generation of multi-object\u0000arrangements remains challenging. Current methods exhibit failures in\u0000generating physically plausible arrangements that respect the provided text\u0000description. We present SceneMotifCoder (SMC), an example-driven framework for\u0000generating 3D object arrangements through visual program learning. SMC\u0000leverages large language models (LLMs) and program synthesis to overcome these\u0000challenges by learning visual programs from example arrangements. These\u0000programs are generalized into compact, editable meta-programs. When combined\u0000with 3D object retrieval and geometry-aware optimization, they can be used to\u0000create object arrangements varying in arrangement structure and contained\u0000objects. Our experiments show that SMC generates high-quality arrangements\u0000using meta-programs learned from few examples. Evaluation results demonstrates\u0000that object arrangements generated by SMC better conform to user-specified text\u0000descriptions and are more physically plausible when compared with\u0000state-of-the-art text-to-3D generation and layout methods.","PeriodicalId":501174,"journal":{"name":"arXiv - CS - Graphics","volume":"10 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-08-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141968744","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}