GECO: Fast Generative Image-to-3D within one SECOnd

Chen Wang, Jiatao Gu, Xiaoxiao Long, Yuan Liu, Lingjie Liu

IEEE Transactions on Visualization and Computer Graphics, published 2025-08-25
DOI: 10.1109/TVCG.2025.3602405
Citations: 0
Abstract
Recent advances in single-image 3D generation fall into two main categories: reconstruction-based and generative methods. Reconstruction-based methods are efficient but do not model uncertainty, leading to blurry artifacts in unseen regions. Generative approaches based on score distillation [47], [71] are slow due to scene-specific optimization. Other methods, such as InstantMesh [76], use a two-stage process, first generating multi-view images with a diffusion model and then reconstructing 3D, which is inefficient because the diffusion model requires multiple denoising steps. To overcome these limitations, we introduce GECO, a feed-forward method for fast, high-quality single-image-to-3D generation that runs within one second on a single GPU. Our approach resolves both the uncertainty and efficiency issues through a two-stage distillation process. In the first stage, we distill a multi-step diffusion model [56] into a one-step model for single-image-to-multi-view synthesis using score distillation. To mitigate the quality degradation introduced by the one-step model, we add a second distillation stage that learns to predict high-quality 3D from imperfect generated multi-view images by performing distillation directly on 3D representations. Experiments demonstrate that GECO achieves significant speedups with reconstruction quality comparable to prior two-stage methods. Code: https://cwchenwang.github.io/geco.
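The first distillation stage described above uses a score-distillation-style objective to train a one-step student against a multi-step diffusion teacher. The following is a minimal toy sketch of that idea, not the GECO implementation: the teacher, noise schedule, weighting, and linear "student" are all illustrative stand-ins chosen so the example is self-contained.

```python
import numpy as np

rng = np.random.default_rng(0)

def teacher_score(x_t, t):
    # Stand-in for a pretrained diffusion teacher's noise prediction
    # eps_phi(x_t, t). Here a toy score that pulls samples toward zero.
    return x_t / (1.0 + t)

def sds_gradient(x0, t, noise):
    # Forward-diffuse the student's one-step output x0 to timestep t,
    # then take the classic score-distillation gradient w(t)*(eps_phi - eps).
    alpha = np.exp(-t)                      # toy noise schedule
    x_t = np.sqrt(alpha) * x0 + np.sqrt(1.0 - alpha) * noise
    w = 1.0                                 # uniform weighting for the sketch
    return w * (teacher_score(x_t, t) - noise)

# One-step "student": a linear map from a latent code, standing in for
# the distilled one-step multi-view generator.
theta = rng.normal(size=(4, 4))
z = rng.normal(size=4)
initial_norm = np.linalg.norm(theta @ z)

for step in range(200):
    x0 = theta @ z                          # one-step generation
    t = rng.uniform(0.1, 1.0)               # random diffusion timestep
    noise = rng.normal(size=4)
    grad_x0 = sds_gradient(x0, t, noise)
    # Chain rule: the gradient w.r.t. theta is outer(dL/dx0, z).
    theta -= 0.05 * np.outer(grad_x0, z)

final_norm = np.linalg.norm(theta @ z)
```

With this toy teacher, the student's output is driven toward the teacher's mode (zero), mirroring how score distillation transfers the teacher's learned distribution to a one-step generator without iterative denoising at inference time.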