arXiv - CS - Graphics: Latest Papers

GaussianHeads: End-to-End Learning of Drivable Gaussian Head Avatars from Coarse-to-fine Representations
arXiv - CS - Graphics Pub Date : 2024-09-18 DOI: arxiv-2409.11951
Kartik Teotia, Hyeongwoo Kim, Pablo Garrido, Marc Habermann, Mohamed Elgharib, Christian Theobalt
{"title":"GaussianHeads: End-to-End Learning of Drivable Gaussian Head Avatars from Coarse-to-fine Representations","authors":"Kartik Teotia, Hyeongwoo Kim, Pablo Garrido, Marc Habermann, Mohamed Elgharib, Christian Theobalt","doi":"arxiv-2409.11951","DOIUrl":"https://doi.org/arxiv-2409.11951","url":null,"abstract":"Real-time rendering of human head avatars is a cornerstone of many computer\u0000graphics applications, such as augmented reality, video games, and films, to\u0000name a few. Recent approaches address this challenge with computationally\u0000efficient geometry primitives in a carefully calibrated multi-view setup.\u0000Albeit producing photorealistic head renderings, it often fails to represent\u0000complex motion changes such as the mouth interior and strongly varying head\u0000poses. We propose a new method to generate highly dynamic and deformable human\u0000head avatars from multi-view imagery in real-time. At the core of our method is\u0000a hierarchical representation of head models that allows to capture the complex\u0000dynamics of facial expressions and head movements. First, with rich facial\u0000features extracted from raw input frames, we learn to deform the coarse facial\u0000geometry of the template mesh. We then initialize 3D Gaussians on the deformed\u0000surface and refine their positions in a fine step. We train this coarse-to-fine\u0000facial avatar model along with the head pose as a learnable parameter in an\u0000end-to-end framework. This enables not only controllable facial animation via\u0000video inputs, but also high-fidelity novel view synthesis of challenging facial\u0000expressions, such as tongue deformations and fine-grained teeth structure under\u0000large motion changes. Moreover, it encourages the learned head avatar to\u0000generalize towards new facial expressions and head poses at inference time. We\u0000demonstrate the performance of our method with comparisons against the related\u0000methods on different datasets, spanning challenging facial expression sequences\u0000across multiple identities. We also show the potential application of our\u0000approach by demonstrating a cross-identity facial performance transfer\u0000application.","PeriodicalId":501174,"journal":{"name":"arXiv - CS - Graphics","volume":"64 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-09-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142266571","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
A Missing Data Imputation GAN for Character Sprite Generation
arXiv - CS - Graphics Pub Date : 2024-09-16 DOI: arxiv-2409.10721
Flávio Coutinho, Luiz Chaimowicz
{"title":"A Missing Data Imputation GAN for Character Sprite Generation","authors":"Flávio Coutinho, Luiz Chaimowicz","doi":"arxiv-2409.10721","DOIUrl":"https://doi.org/arxiv-2409.10721","url":null,"abstract":"Creating and updating pixel art character sprites with many frames spanning\u0000different animations and poses takes time and can quickly become repetitive.\u0000However, that can be partially automated to allow artists to focus on more\u0000creative tasks. In this work, we concentrate on creating pixel art character\u0000sprites in a target pose from images of them facing other three directions. We\u0000present a novel approach to character generation by framing the problem as a\u0000missing data imputation task. Our proposed generative adversarial networks\u0000model receives the images of a character in all available domains and produces\u0000the image of the missing pose. We evaluated our approach in the scenarios with\u0000one, two, and three missing images, achieving similar or better results to the\u0000state-of-the-art when more images are available. We also evaluate the impact of\u0000the proposed changes to the base architecture.","PeriodicalId":501174,"journal":{"name":"arXiv - CS - Graphics","volume":"101 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-09-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142266572","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
Phys3DGS: Physically-based 3D Gaussian Splatting for Inverse Rendering
arXiv - CS - Graphics Pub Date : 2024-09-16 DOI: arxiv-2409.10335
Euntae Choi, Sungjoo Yoo
{"title":"Phys3DGS: Physically-based 3D Gaussian Splatting for Inverse Rendering","authors":"Euntae Choi, Sungjoo Yoo","doi":"arxiv-2409.10335","DOIUrl":"https://doi.org/arxiv-2409.10335","url":null,"abstract":"We propose two novel ideas (adoption of deferred rendering and mesh-based\u0000representation) to improve the quality of 3D Gaussian splatting (3DGS) based\u0000inverse rendering. We first report a problem incurred by hidden Gaussians,\u0000where Gaussians beneath the surface adversely affect the pixel color in the\u0000volume rendering adopted by the existing methods. In order to resolve the\u0000problem, we propose applying deferred rendering and report new problems\u0000incurred in a naive application of deferred rendering to the existing\u00003DGS-based inverse rendering. In an effort to improve the quality of 3DGS-based\u0000inverse rendering under deferred rendering, we propose a novel two-step\u0000training approach which (1) exploits mesh extraction and utilizes a hybrid\u0000mesh-3DGS representation and (2) applies novel regularization methods to better\u0000exploit the mesh. Our experiments show that, under relighting, the proposed\u0000method offers significantly better rendering quality than the existing\u00003DGS-based inverse rendering methods. Compared with the SOTA voxel grid-based\u0000inverse rendering method, it gives better rendering quality while offering\u0000real-time rendering.","PeriodicalId":501174,"journal":{"name":"arXiv - CS - Graphics","volume":"8 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-09-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142266575","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
Playground v3: Improving Text-to-Image Alignment with Deep-Fusion Large Language Models
arXiv - CS - Graphics Pub Date : 2024-09-16 DOI: arxiv-2409.10695
Bingchen Liu, Ehsan Akhgari, Alexander Visheratin, Aleks Kamko, Linmiao Xu, Shivam Shrirao, Joao Souza, Suhail Doshi, Daiqing Li
{"title":"Playground v3: Improving Text-to-Image Alignment with Deep-Fusion Large Language Models","authors":"Bingchen Liu, Ehsan Akhgari, Alexander Visheratin, Aleks Kamko, Linmiao Xu, Shivam Shrirao, Joao Souza, Suhail Doshi, Daiqing Li","doi":"arxiv-2409.10695","DOIUrl":"https://doi.org/arxiv-2409.10695","url":null,"abstract":"We introduce Playground v3 (PGv3), our latest text-to-image model that\u0000achieves state-of-the-art (SoTA) performance across multiple testing\u0000benchmarks, excels in graphic design abilities and introduces new capabilities.\u0000Unlike traditional text-to-image generative models that rely on pre-trained\u0000language models like T5 or CLIP text encoders, our approach fully integrates\u0000Large Language Models (LLMs) with a novel structure that leverages text\u0000conditions exclusively from a decoder-only LLM. Additionally, to enhance image\u0000captioning quality-we developed an in-house captioner, capable of generating\u0000captions with varying levels of detail, enriching the diversity of text\u0000structures. We also introduce a new benchmark CapsBench to evaluate detailed\u0000image captioning performance. Experimental results demonstrate that PGv3 excels\u0000in text prompt adherence, complex reasoning, and accurate text rendering. User\u0000preference studies indicate the super-human graphic design ability of our model\u0000for common design applications, such as stickers, posters, and logo designs.\u0000Furthermore, PGv3 introduces new capabilities, including precise RGB color\u0000control and robust multilingual understanding.","PeriodicalId":501174,"journal":{"name":"arXiv - CS - Graphics","volume":"99 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-09-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142266574","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
Visualizing Temporal Topic Embeddings with a Compass
arXiv - CS - Graphics Pub Date : 2024-09-16 DOI: arxiv-2409.10649
Daniel Palamarchuk, Lemara Williams, Brian Mayer, Thomas Danielson, Rebecca Faust, Larry Deschaine, Chris North
{"title":"Visualizing Temporal Topic Embeddings with a Compass","authors":"Daniel Palamarchuk, Lemara Williams, Brian Mayer, Thomas Danielson, Rebecca Faust, Larry Deschaine, Chris North","doi":"arxiv-2409.10649","DOIUrl":"https://doi.org/arxiv-2409.10649","url":null,"abstract":"Dynamic topic modeling is useful at discovering the development and change in\u0000latent topics over time. However, present methodology relies on algorithms that\u0000separate document and word representations. This prevents the creation of a\u0000meaningful embedding space where changes in word usage and documents can be\u0000directly analyzed in a temporal context. This paper proposes an expansion of\u0000the compass-aligned temporal Word2Vec methodology into dynamic topic modeling.\u0000Such a method allows for the direct comparison of word and document embeddings\u0000across time in dynamic topics. This enables the creation of visualizations that\u0000incorporate temporal word embeddings within the context of documents into topic\u0000visualizations. In experiments against the current state-of-the-art, our\u0000proposed method demonstrates overall competitive performance in topic relevancy\u0000and diversity across temporal datasets of varying size. Simultaneously, it\u0000provides insightful visualizations focused on temporal word embeddings while\u0000maintaining the insights provided by global topic evolution, advancing our\u0000understanding of how topics evolve over time.","PeriodicalId":501174,"journal":{"name":"arXiv - CS - Graphics","volume":"20 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-09-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142266573","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
DrawingSpinUp: 3D Animation from Single Character Drawings
arXiv - CS - Graphics Pub Date : 2024-09-13 DOI: arxiv-2409.08615
Jie Zhou, Chufeng Xiao, Miu-Ling Lam, Hongbo Fu
{"title":"DrawingSpinUp: 3D Animation from Single Character Drawings","authors":"Jie Zhou, Chufeng Xiao, Miu-Ling Lam, Hongbo Fu","doi":"arxiv-2409.08615","DOIUrl":"https://doi.org/arxiv-2409.08615","url":null,"abstract":"Animating various character drawings is an engaging visual content creation\u0000task. Given a single character drawing, existing animation methods are limited\u0000to flat 2D motions and thus lack 3D effects. An alternative solution is to\u0000reconstruct a 3D model from a character drawing as a proxy and then retarget 3D\u0000motion data onto it. However, the existing image-to-3D methods could not work\u0000well for amateur character drawings in terms of appearance and geometry. We\u0000observe the contour lines, commonly existing in character drawings, would\u0000introduce significant ambiguity in texture synthesis due to their\u0000view-dependence. Additionally, thin regions represented by single-line contours\u0000are difficult to reconstruct (e.g., slim limbs of a stick figure) due to their\u0000delicate structures. To address these issues, we propose a novel system,\u0000DrawingSpinUp, to produce plausible 3D animations and breathe life into\u0000character drawings, allowing them to freely spin up, leap, and even perform a\u0000hip-hop dance. For appearance improvement, we adopt a removal-then-restoration\u0000strategy to first remove the view-dependent contour lines and then render them\u0000back after retargeting the reconstructed character. For geometry refinement, we\u0000develop a skeleton-based thinning deformation algorithm to refine the slim\u0000structures represented by the single-line contours. The experimental\u0000evaluations and a perceptual user study show that our proposed method\u0000outperforms the existing 2D and 3D animation methods and generates high-quality\u00003D animations from a single character drawing. Please refer to our project page\u0000(https://lordliang.github.io/DrawingSpinUp) for the code and generated\u0000animations.","PeriodicalId":501174,"journal":{"name":"arXiv - CS - Graphics","volume":"122 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-09-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142266577","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
A Diffusion Approach to Radiance Field Relighting using Multi-Illumination Synthesis
arXiv - CS - Graphics Pub Date : 2024-09-13 DOI: arxiv-2409.08947
Yohan Poirier-Ginter, Alban Gauthier, Julien Phillip, Jean-Francois Lalonde, George Drettakis
{"title":"A Diffusion Approach to Radiance Field Relighting using Multi-Illumination Synthesis","authors":"Yohan Poirier-Ginter, Alban Gauthier, Julien Phillip, Jean-Francois Lalonde, George Drettakis","doi":"arxiv-2409.08947","DOIUrl":"https://doi.org/arxiv-2409.08947","url":null,"abstract":"Relighting radiance fields is severely underconstrained for multi-view data,\u0000which is most often captured under a single illumination condition; It is\u0000especially hard for full scenes containing multiple objects. We introduce a\u0000method to create relightable radiance fields using such single-illumination\u0000data by exploiting priors extracted from 2D image diffusion models. We first\u0000fine-tune a 2D diffusion model on a multi-illumination dataset conditioned by\u0000light direction, allowing us to augment a single-illumination capture into a\u0000realistic -- but possibly inconsistent -- multi-illumination dataset from\u0000directly defined light directions. We use this augmented data to create a\u0000relightable radiance field represented by 3D Gaussian splats. To allow direct\u0000control of light direction for low-frequency lighting, we represent appearance\u0000with a multi-layer perceptron parameterized on light direction. To enforce\u0000multi-view consistency and overcome inaccuracies we optimize a per-image\u0000auxiliary feature vector. We show results on synthetic and real multi-view data\u0000under single illumination, demonstrating that our method successfully exploits\u00002D diffusion model priors to allow realistic 3D relighting for complete scenes.\u0000Project site\u0000https://repo-sam.inria.fr/fungraph/generative-radiance-field-relighting/","PeriodicalId":501174,"journal":{"name":"arXiv - CS - Graphics","volume":"105 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-09-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142266708","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
AdR-Gaussian: Accelerating Gaussian Splatting with Adaptive Radius
arXiv - CS - Graphics Pub Date : 2024-09-13 DOI: arxiv-2409.08669
Xinzhe Wang, Ran Yi, Lizhuang Ma
{"title":"AdR-Gaussian: Accelerating Gaussian Splatting with Adaptive Radius","authors":"Xinzhe Wang, Ran Yi, Lizhuang Ma","doi":"arxiv-2409.08669","DOIUrl":"https://doi.org/arxiv-2409.08669","url":null,"abstract":"3D Gaussian Splatting (3DGS) is a recent explicit 3D representation that has\u0000achieved high-quality reconstruction and real-time rendering of complex scenes.\u0000However, the rasterization pipeline still suffers from unnecessary overhead\u0000resulting from avoidable serial Gaussian culling, and uneven load due to the\u0000distinct number of Gaussian to be rendered across pixels, which hinders wider\u0000promotion and application of 3DGS. In order to accelerate Gaussian splatting,\u0000we propose AdR-Gaussian, which moves part of serial culling in Render stage\u0000into the earlier Preprocess stage to enable parallel culling, employing\u0000adaptive radius to narrow the rendering pixel range for each Gaussian, and\u0000introduces a load balancing method to minimize thread waiting time during the\u0000pixel-parallel rendering. Our contributions are threefold, achieving a\u0000rendering speed of 310% while maintaining equivalent or even better quality\u0000than the state-of-the-art. Firstly, we propose to early cull Gaussian-Tile\u0000pairs of low splatting opacity based on an adaptive radius in the\u0000Gaussian-parallel Preprocess stage, which reduces the number of affected tile\u0000through the Gaussian bounding circle, thus reducing unnecessary overhead and\u0000achieving faster rendering speed. Secondly, we further propose early culling\u0000based on axis-aligned bounding box for Gaussian splatting, which achieves a\u0000more significant reduction in ineffective expenses by accurately calculating\u0000the Gaussian size in the 2D directions. Thirdly, we propose a balancing\u0000algorithm for pixel thread load, which compresses the information of heavy-load\u0000pixels to reduce thread waiting time, and enhance information of light-load\u0000pixels to hedge against rendering quality loss. Experiments on three datasets\u0000demonstrate that our algorithm can significantly improve the Gaussian Splatting\u0000rendering speed.","PeriodicalId":501174,"journal":{"name":"arXiv - CS - Graphics","volume":"33 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-09-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142266576","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
Robust Dual Gaussian Splatting for Immersive Human-centric Volumetric Videos
arXiv - CS - Graphics Pub Date : 2024-09-12 DOI: arxiv-2409.08353
Yuheng Jiang, Zhehao Shen, Yu Hong, Chengcheng Guo, Yize Wu, Yingliang Zhang, Jingyi Yu, Lan Xu
{"title":"Robust Dual Gaussian Splatting for Immersive Human-centric Volumetric Videos","authors":"Yuheng Jiang, Zhehao Shen, Yu Hong, Chengcheng Guo, Yize Wu, Yingliang Zhang, Jingyi Yu, Lan Xu","doi":"arxiv-2409.08353","DOIUrl":"https://doi.org/arxiv-2409.08353","url":null,"abstract":"Volumetric video represents a transformative advancement in visual media,\u0000enabling users to freely navigate immersive virtual experiences and narrowing\u0000the gap between digital and real worlds. However, the need for extensive manual\u0000intervention to stabilize mesh sequences and the generation of excessively\u0000large assets in existing workflows impedes broader adoption. In this paper, we\u0000present a novel Gaussian-based approach, dubbed textit{DualGS}, for real-time\u0000and high-fidelity playback of complex human performance with excellent\u0000compression ratios. Our key idea in DualGS is to separately represent motion\u0000and appearance using the corresponding skin and joint Gaussians. Such an\u0000explicit disentanglement can significantly reduce motion redundancy and enhance\u0000temporal coherence. We begin by initializing the DualGS and anchoring skin\u0000Gaussians to joint Gaussians at the first frame. Subsequently, we employ a\u0000coarse-to-fine training strategy for frame-by-frame human performance modeling.\u0000It includes a coarse alignment phase for overall motion prediction as well as a\u0000fine-grained optimization for robust tracking and high-fidelity rendering. To\u0000integrate volumetric video seamlessly into VR environments, we efficiently\u0000compress motion using entropy encoding and appearance using codec compression\u0000coupled with a persistent codebook. Our approach achieves a compression ratio\u0000of up to 120 times, only requiring approximately 350KB of storage per frame. We\u0000demonstrate the efficacy of our representation through photo-realistic,\u0000free-view experiences on VR headsets, enabling users to immersively watch\u0000musicians in performance and feel the rhythm of the notes at the performers'\u0000fingertips.","PeriodicalId":501174,"journal":{"name":"arXiv - CS - Graphics","volume":"49 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-09-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142266578","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
Thermal3D-GS: Physics-induced 3D Gaussians for Thermal Infrared Novel-view Synthesis
arXiv - CS - Graphics Pub Date : 2024-09-12 DOI: arxiv-2409.08042
Qian Chen, Shihao Shu, Xiangzhi Bai
{"title":"Thermal3D-GS: Physics-induced 3D Gaussians for Thermal Infrared Novel-view Synthesis","authors":"Qian Chen, Shihao Shu, Xiangzhi Bai","doi":"arxiv-2409.08042","DOIUrl":"https://doi.org/arxiv-2409.08042","url":null,"abstract":"Novel-view synthesis based on visible light has been extensively studied. In\u0000comparison to visible light imaging, thermal infrared imaging offers the\u0000advantage of all-weather imaging and strong penetration, providing increased\u0000possibilities for reconstruction in nighttime and adverse weather scenarios.\u0000However, thermal infrared imaging is influenced by physical characteristics\u0000such as atmospheric transmission effects and thermal conduction, hindering the\u0000precise reconstruction of intricate details in thermal infrared scenes,\u0000manifesting as issues of floaters and indistinct edge features in synthesized\u0000images. To address these limitations, this paper introduces a physics-induced\u00003D Gaussian splatting method named Thermal3D-GS. Thermal3D-GS begins by\u0000modeling atmospheric transmission effects and thermal conduction in\u0000three-dimensional media using neural networks. Additionally, a temperature\u0000consistency constraint is incorporated into the optimization objective to\u0000enhance the reconstruction accuracy of thermal infrared images. Furthermore, to\u0000validate the effectiveness of our method, the first large-scale benchmark\u0000dataset for this field named Thermal Infrared Novel-view Synthesis Dataset\u0000(TI-NSD) is created. This dataset comprises 20 authentic thermal infrared video\u0000scenes, covering indoor, outdoor, and UAV(Unmanned Aerial Vehicle) scenarios,\u0000totaling 6,664 frames of thermal infrared image data. Based on this dataset,\u0000this paper experimentally verifies the effectiveness of Thermal3D-GS. The\u0000results indicate that our method outperforms the baseline method with a 3.03 dB\u0000improvement in PSNR and significantly addresses the issues of floaters and\u0000indistinct edge features present in the baseline method. Our dataset and\u0000codebase will be released in\u0000href{https://github.com/mzzcdf/Thermal3DGS}{textcolor{red}{Thermal3DGS}}.","PeriodicalId":501174,"journal":{"name":"arXiv - CS - Graphics","volume":"60 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-09-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142221801","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0