Chen Wang, Jiatao Gu, Xiaoxiao Long, Yuan Liu, Lingjie Liu
IEEE Transactions on Visualization and Computer Graphics. Published 2025-08-25. DOI: 10.1109/TVCG.2025.3602405 (https://doi.org/10.1109/TVCG.2025.3602405)
GECO: Fast Generative Image-to-3D within one SECOnd.
Recent advances in single-image 3D generation have produced two main categories of methods: reconstruction-based and generative. Reconstruction-based methods are efficient but do not model uncertainty, which leads to blurry artifacts in unseen regions. Generative approaches based on score distillation [47], [71] are slow because they require scene-specific optimization. Other methods, such as InstantMesh [76], use a two-stage process (generating multi-view images with a diffusion model, then reconstructing 3D from them), which is inefficient because the diffusion model needs multiple denoising steps. To overcome these limitations, we introduce GECO, a feed-forward method for fast, high-quality single-image-to-3D generation within one second on a single GPU. Our approach addresses both the uncertainty and the inefficiency issues through a two-stage distillation process. In the first stage, we distill a multi-step diffusion model [56] into a one-step model using score distillation for single-image-to-multi-view synthesis. To mitigate the drop in synthesis quality caused by the one-step model, we introduce a second distillation stage that learns to predict high-quality 3D from imperfect generated multi-view images by performing distillation directly on 3D representations. Experiments demonstrate that GECO is significantly faster than prior two-stage methods while achieving comparable reconstruction quality. Code: https://cwchenwang.github.io/geco.
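The efficiency argument in the abstract (multiple denoising steps versus a single distilled step) can be illustrated with a back-of-the-envelope latency model. The sketch below is not taken from the paper: the class `PipelineCost`, the function `speedup`, and every number in it (50 denoising steps, 0.125 s per generator pass, 0.5 s for reconstruction) are hypothetical placeholders chosen only to show how the cost of a two-stage pipeline scales with the number of generator forward passes.

```python
from dataclasses import dataclass


@dataclass
class PipelineCost:
    """Toy latency model for a two-stage image-to-3D pipeline.

    All values are hypothetical placeholders, not measurements from the paper.
    """
    denoise_steps: int      # forward passes of the multi-view generator
    step_latency_s: float   # assumed seconds per generator forward pass
    recon_latency_s: float  # assumed seconds for the 3D reconstruction stage

    def total_seconds(self) -> float:
        # Stage 1 cost grows linearly with the number of denoising steps;
        # stage 2 (multi-view images to 3D) is a single feed-forward pass.
        return self.denoise_steps * self.step_latency_s + self.recon_latency_s


def speedup(baseline: PipelineCost, distilled: PipelineCost) -> float:
    """Ratio of baseline latency to distilled latency."""
    return baseline.total_seconds() / distilled.total_seconds()


# Hypothetical multi-step pipeline: 50 denoising steps before reconstruction.
multi = PipelineCost(denoise_steps=50, step_latency_s=0.125, recon_latency_s=0.5)

# Hypothetical one-step pipeline: same per-pass cost, one generator pass.
one = PipelineCost(denoise_steps=1, step_latency_s=0.125, recon_latency_s=0.5)

print(f"multi-step: {multi.total_seconds():.3f} s")
print(f"one-step:   {one.total_seconds():.3f} s")
print(f"speedup:    {speedup(multi, one):.1f}x")
```

Under these assumed numbers the generator dominates the multi-step pipeline, so reducing it to one pass removes most of the latency. The model says nothing about quality, which is the concern the second distillation stage in the abstract is meant to address.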