GECO: Fast Generative Image-to-3D within one SECOnd.

Impact Factor: 6.5
Chen Wang, Jiatao Gu, Xiaoxiao Long, Yuan Liu, Lingjie Liu
Journal: IEEE Transactions on Visualization and Computer Graphics (TVCG)
DOI: 10.1109/TVCG.2025.3602405
Published: 2025-08-25 (Journal Article)
Citations: 0
Code: https://cwchenwang.github.io/geco

Abstract

Recent advancements in single-image 3D generation have produced two main categories of methods: reconstruction-based and generative methods. Reconstruction-based methods are efficient but lack uncertainty handling, leading to blurry artifacts in unseen regions. Generative approaches based on score distillation [47], [71] are slow due to scene-specific optimization. Other methods, like InstantMesh [76], use a two-stage process (generating multi-view images with a diffusion model, then reconstructing 3D), which is inefficient due to the multiple denoising steps of the diffusion model. To overcome these limitations, we introduce GECO, a feed-forward method for fast and high-quality single-image-to-3D generation within one second on a single GPU. Our approach resolves uncertainty and inefficiency issues through a two-stage distillation process. In the first stage, we distill a multi-step diffusion model [56] into a one-step model using score distillation for single-image-to-multi-view synthesis. To mitigate the synthesis quality degradation caused by the one-step model, we introduce a second distillation stage that learns to predict high-quality 3D from imperfect generated multi-view images by performing distillation directly on 3D representations. Experiments demonstrate that GECO offers significant speed improvements and comparable reconstruction quality relative to prior two-stage methods. Code: https://cwchenwang.github.io/geco.
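The inference pipeline described in the abstract can be sketched at a high level: a distilled one-step generator produces all views in a single forward pass (no iterative denoising loop), and a feed-forward reconstructor maps those views to a 3D representation. The sketch below is illustrative only; the function names `one_step_multiview` and `reconstruct_3d`, and the placeholder network bodies, are assumptions, not the authors' actual API or architecture.

```python
import numpy as np

def one_step_multiview(image: np.ndarray, noise: np.ndarray,
                       n_views: int = 6) -> np.ndarray:
    """Stand-in for the distilled one-step diffusion model: a single
    network call maps (input image, noise) to n_views images, replacing
    the multi-step denoising loop of the original diffusion model."""
    views = np.repeat(image[None, ...], n_views, axis=0)  # (n_views, H, W, 3)
    return views + 0.0 * noise  # one forward pass; no denoising iterations

def reconstruct_3d(views: np.ndarray) -> dict:
    """Stand-in for the feed-forward 3D reconstructor trained in the
    second distillation stage; returns a dummy 3D representation."""
    return {"representation": views.mean(axis=0)}

# End-to-end: two network calls total, which is what enables
# sub-second generation on a single GPU.
image = np.zeros((64, 64, 3), dtype=np.float32)
noise = np.zeros((6, 64, 64, 3), dtype=np.float32)
views = one_step_multiview(image, noise)  # stage 1: one-step multi-view synthesis
asset = reconstruct_3d(views)             # stage 2: feed-forward 3D prediction
```

The key contrast with prior two-stage methods like InstantMesh is in stage 1: the multi-step denoising loop is collapsed into a single call by the first distillation stage, and the second stage trains the reconstructor to tolerate the resulting imperfect views.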
