基于多视图扩散模型的高斯溅射生成对象插入

IF 3.8 3区计算机科学 Q2 COMPUTER SCIENCE, INFORMATION SYSTEMS

Visual Informatics Pub Date : 2025-04-08 DOI:10.1016/j.visinf.2025.100238

Hongliang Zhong, Can Wang, Jingbo Zhang, Jing Liao

{"title":"基于多视图扩散模型的高斯溅射生成对象插入","authors":"Hongliang Zhong, Can Wang, Jingbo Zhang, Jing Liao","doi":"10.1016/j.visinf.2025.100238","DOIUrl":null,"url":null,"abstract":"<div><div>Generating and inserting new objects into 3D content is a compelling approach for achieving versatile scene recreation. Existing methods, which rely on SDS optimization or single-view inpainting, often struggle to produce high-quality results. To address this, we propose a novel method for object insertion in 3D content represented by Gaussian Splatting. Our approach introduces a multi-view diffusion model, dubbed MVInpainter, which is built upon a pre-trained stable video diffusion model to facilitate view-consistent object inpainting. Within MVInpainter, we incorporate a ControlNet-based conditional injection module to enable controlled and more predictable multi-view generation. After generating the multi-view inpainted results, we further propose a mask-aware 3D reconstruction technique to refine Gaussian Splatting reconstruction from these sparse inpainted views. By leveraging these fabricate techniques, our approach yields diverse results, ensures view-consistent and harmonious insertions, and produces better object quality. Extensive experiments demonstrate that our approach outperforms existing methods.</div></div>","PeriodicalId":36903,"journal":{"name":"Visual Informatics","volume":"9 2","pages":"Article 100238"},"PeriodicalIF":3.8000,"publicationDate":"2025-04-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"Generative object insertion in Gaussian splatting with a multi-view diffusion model\",\"authors\":\"Hongliang Zhong, Can Wang, Jingbo Zhang, Jing Liao\",\"doi\":\"10.1016/j.visinf.2025.100238\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"<div><div>Generating and inserting new objects into 3D content is a compelling approach for achieving versatile scene recreation. Existing methods, which rely on SDS optimization or single-view inpainting, often struggle to produce high-quality results. To address this, we propose a novel method for object insertion in 3D content represented by Gaussian Splatting. Our approach introduces a multi-view diffusion model, dubbed MVInpainter, which is built upon a pre-trained stable video diffusion model to facilitate view-consistent object inpainting. Within MVInpainter, we incorporate a ControlNet-based conditional injection module to enable controlled and more predictable multi-view generation. After generating the multi-view inpainted results, we further propose a mask-aware 3D reconstruction technique to refine Gaussian Splatting reconstruction from these sparse inpainted views. By leveraging these fabricate techniques, our approach yields diverse results, ensures view-consistent and harmonious insertions, and produces better object quality. Extensive experiments demonstrate that our approach outperforms existing methods.</div></div>\",\"PeriodicalId\":36903,\"journal\":{\"name\":\"Visual Informatics\",\"volume\":\"9 2\",\"pages\":\"Article 100238\"},\"PeriodicalIF\":3.8000,\"publicationDate\":\"2025-04-08\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Visual Informatics\",\"FirstCategoryId\":\"94\",\"ListUrlMain\":\"https://www.sciencedirect.com/science/article/pii/S2468502X2500021X\",\"RegionNum\":3,\"RegionCategory\":\"计算机科学\",\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"Q2\",\"JCRName\":\"COMPUTER SCIENCE, INFORMATION SYSTEMS\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Visual Informatics","FirstCategoryId":"94","ListUrlMain":"https://www.sciencedirect.com/science/article/pii/S2468502X2500021X","RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q2","JCRName":"COMPUTER SCIENCE, INFORMATION SYSTEMS","Score":null,"Total":0}

引用次数: 0

摘要

在3D内容中生成和插入新对象是实现多用途场景再现的一种引人注目的方法。现有的方法依赖于SDS优化或单视图绘制，通常难以产生高质量的结果。为了解决这个问题，我们提出了一种用高斯溅射表示的3D内容插入对象的新方法。我们的方法引入了一个多视图扩散模型，称为MVInpainter，它建立在一个预训练的稳定视频扩散模型之上，以促进视图一致的对象在绘画中。在MVInpainter中，我们结合了一个基于controlnet的条件注入模块，以实现可控和更可预测的多视图生成。在生成多视图绘制结果后，我们进一步提出了一种基于蒙版感知的3D重建技术，以从这些稀疏的绘制视图中改进高斯飞溅重建。通过利用这些制造技术，我们的方法产生不同的结果，确保视图一致和和谐的插入，并产生更好的对象质量。大量的实验表明，我们的方法优于现有的方法。

本文章由计算机程序翻译，如有差异，请以英文原文为准。

查看原文本刊更多论文

Generative object insertion in Gaussian splatting with a multi-view diffusion model

Generating and inserting new objects into 3D content is a compelling approach for achieving versatile scene recreation. Existing methods, which rely on SDS optimization or single-view inpainting, often struggle to produce high-quality results. To address this, we propose a novel method for object insertion in 3D content represented by Gaussian Splatting. Our approach introduces a multi-view diffusion model, dubbed MVInpainter, which is built upon a pre-trained stable video diffusion model to facilitate view-consistent object inpainting. Within MVInpainter, we incorporate a ControlNet-based conditional injection module to enable controlled and more predictable multi-view generation. After generating the multi-view inpainted results, we further propose a mask-aware 3D reconstruction technique to refine Gaussian Splatting reconstruction from these sparse inpainted views. By leveraging these fabricate techniques, our approach yields diverse results, ensures view-consistent and harmonious insertions, and produces better object quality. Extensive experiments demonstrate that our approach outperforms existing methods.

求助全文

通过发布文献求助，成功后即可免费获取论文全文。去求助

来源期刊

Visual Informatics Computer Science-Computer Graphics and Computer-Aided Design

CiteScore

6.70

自引率

3.30%

发文量

审稿时长

79 days