Qiuhong Shen, Xingyi Yang, Michael Bi Mi, Xinchao Wang
{"title":"Vista3D: Unravel the 3D Darkside of a Single Image","authors":"Qiuhong Shen, Xingyi Yang, Michael Bi Mi, Xinchao Wang","doi":"arxiv-2409.12193","DOIUrl":null,"url":null,"abstract":"We embark on the age-old quest: unveiling the hidden dimensions of objects\nfrom mere glimpses of their visible parts. To address this, we present Vista3D,\na framework that realizes swift and consistent 3D generation within a mere 5\nminutes. At the heart of Vista3D lies a two-phase approach: the coarse phase\nand the fine phase. In the coarse phase, we rapidly generate initial geometry\nwith Gaussian Splatting from a single image. In the fine phase, we extract a\nSigned Distance Function (SDF) directly from learned Gaussian Splatting,\noptimizing it with a differentiable isosurface representation. Furthermore, it\nelevates the quality of generation by using a disentangled representation with\ntwo independent implicit functions to capture both visible and obscured aspects\nof objects. Additionally, it harmonizes gradients from 2D diffusion prior with\n3D-aware diffusion priors by angular diffusion prior composition. Through\nextensive evaluation, we demonstrate that Vista3D effectively sustains a\nbalance between the consistency and diversity of the generated 3D objects.\nDemos and code will be available at https://github.com/florinshen/Vista3D.","PeriodicalId":501480,"journal":{"name":"arXiv - CS - Multimedia","volume":"30 1","pages":""},"PeriodicalIF":0.0000,"publicationDate":"2024-09-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"arXiv - CS - Multimedia","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/arxiv-2409.12193","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 0
Abstract
We embark on the age-old quest: unveiling the hidden dimensions of objects
from mere glimpses of their visible parts. To address this, we present Vista3D,
a framework that realizes swift and consistent 3D generation within a mere 5
minutes. At the heart of Vista3D lies a two-phase approach: the coarse phase
and the fine phase. In the coarse phase, we rapidly generate initial geometry
with Gaussian Splatting from a single image. In the fine phase, we extract a
Signed Distance Function (SDF) directly from learned Gaussian Splatting,
optimizing it with a differentiable isosurface representation. Furthermore, it
elevates the quality of generation by using a disentangled representation with
two independent implicit functions to capture both visible and obscured aspects
of objects. Additionally, it harmonizes gradients from 2D diffusion prior with
3D-aware diffusion priors by angular diffusion prior composition. Through
extensive evaluation, we demonstrate that Vista3D effectively sustains a
balance between the consistency and diversity of the generated 3D objects.
Demos and code will be available at https://github.com/florinshen/Vista3D.