MVGaussian: High-Fidelity Text-to-3D Content Generation with Multi-View Guidance and Surface Densification

Phu Pham, Aradhya N. Mathur, Ojaswa Sharma, Aniket Bera

Abstract

The field of text-to-3D content generation has made significant progress in generating realistic 3D objects, with existing methodologies like Score Distillation Sampling (SDS) offering promising guidance. However, these methods often encounter the "Janus" problem: multi-face ambiguities caused by imprecise guidance. Additionally, while recent advancements in 3D Gaussian Splatting have shown its efficacy in representing 3D volumes, optimization of this representation remains largely unexplored. This paper introduces a unified framework for text-to-3D content generation that addresses these critical gaps. Our approach utilizes multi-view guidance to iteratively form the structure of the 3D model, progressively enhancing detail and accuracy. We also introduce a novel densification algorithm that aligns Gaussians close to the surface, optimizing the structural integrity and fidelity of the generated models. Extensive experiments validate our approach, demonstrating that it produces high-quality visual outputs with minimal time cost. Notably, our method achieves high-quality results within half an hour of training, offering a substantial efficiency gain over most existing methods, which require hours of training time to achieve comparable results.
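The Score Distillation Sampling signal the abstract refers to can be sketched roughly as follows. This is a toy illustration only, not the authors' implementation: the `denoiser` callable, the cosine noise schedule, and the weighting `w(t) = 1 - alpha_t` are placeholder assumptions standing in for a pretrained text-conditioned diffusion model and its real schedule.

```python
import numpy as np

def sds_gradient(rendered, denoiser, t, rng):
    """One toy Score Distillation Sampling (SDS) step on a rendered image.

    rendered : (H, W, 3) array from a differentiable renderer.
    denoiser : callable(noisy_image, t) -> predicted noise; stands in for a
               pretrained text-conditioned diffusion model (an assumption).
    t        : diffusion timestep in (0, 1).
    """
    # DDPM-style noising: x_t = sqrt(alpha_t) * x + sqrt(1 - alpha_t) * eps
    alpha_t = np.cos(0.5 * np.pi * t) ** 2        # toy cosine noise schedule
    eps = rng.standard_normal(rendered.shape)
    noisy = np.sqrt(alpha_t) * rendered + np.sqrt(1.0 - alpha_t) * eps

    eps_hat = denoiser(noisy, t)                  # model's noise estimate
    w_t = 1.0 - alpha_t                           # timestep weighting w(t)
    # SDS gradient: w(t) * (eps_hat - eps), which would be backpropagated
    # through the renderer into the 3D (e.g. Gaussian) parameters.
    return w_t * (eps_hat - eps)

rng = np.random.default_rng(0)
img = rng.random((8, 8, 3))
grad = sds_gradient(img, lambda x, t: np.zeros_like(x), 0.5, rng)
print(grad.shape)  # (8, 8, 3)
```

In a real pipeline this per-pixel signal is what guides the multi-view optimization; here a zero-returning lambda replaces the diffusion model purely so the sketch runs standalone.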