{"title":"PG-VTON:前后服装引导全景高斯虚拟试戴扩散建模","authors":"Jian Zheng, Shengwei Sang, Yifei Lu, Guojun Dai, Xiaoyang Mao, Wenhui Zhou","doi":"10.1002/cav.70054","DOIUrl":null,"url":null,"abstract":"<div>\n \n <p>Virtual try-on (VTON) technology enables the rapid creation of realistic try-on experiences, which makes it highly valuable for the metaverse and e-commerce. However, 2D VTON methods struggle to convey depth and immersion, while existing 3D methods require multi-view garment images and face challenges in generating high-fidelity garment textures. To address the aforementioned limitations, this paper proposes a panoramic Gaussian VTON framework guided solely by front-and-back garment information, named PG-VTON, which uses an adapted local controllable diffusion model for generating virtual dressing effects in specific regions. Specifically, PG-VTON adopts a coarse-to-fine architecture consisting of two stages. The coarse editing stage employs a local controllable diffusion model with a score distillation sampling (SDS) loss to generate coarse garment geometries with high-level semantics. Meanwhile, the refinement stage applies the same diffusion model with a photometric loss not only to enhance garment details and reduce artifacts but also to correct unwanted noise and distortions introduced during the coarse stage, thereby effectively enhancing realism. To improve training efficiency, we further introduce a dynamic noise scheduling (DNS) strategy, which ensures stable training and high-fidelity results. Experimental results demonstrate the superiority of our method, which achieves geometrically consistent and highly realistic 3D virtual try-on generation.</p>\n </div>","PeriodicalId":50645,"journal":{"name":"Computer Animation and Virtual Worlds","volume":"36 3","pages":""},"PeriodicalIF":0.9000,"publicationDate":"2025-05-27","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"PG-VTON: Front-And-Back Garment Guided Panoramic Gaussian Virtual Try-On With Diffusion Modeling\",\"authors\":\"Jian Zheng, Shengwei Sang, Yifei Lu, Guojun Dai, Xiaoyang Mao, Wenhui Zhou\",\"doi\":\"10.1002/cav.70054\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"<div>\\n \\n <p>Virtual try-on (VTON) technology enables the rapid creation of realistic try-on experiences, which makes it highly valuable for the metaverse and e-commerce. However, 2D VTON methods struggle to convey depth and immersion, while existing 3D methods require multi-view garment images and face challenges in generating high-fidelity garment textures. To address the aforementioned limitations, this paper proposes a panoramic Gaussian VTON framework guided solely by front-and-back garment information, named PG-VTON, which uses an adapted local controllable diffusion model for generating virtual dressing effects in specific regions. Specifically, PG-VTON adopts a coarse-to-fine architecture consisting of two stages. The coarse editing stage employs a local controllable diffusion model with a score distillation sampling (SDS) loss to generate coarse garment geometries with high-level semantics. Meanwhile, the refinement stage applies the same diffusion model with a photometric loss not only to enhance garment details and reduce artifacts but also to correct unwanted noise and distortions introduced during the coarse stage, thereby effectively enhancing realism. 
To improve training efficiency, we further introduce a dynamic noise scheduling (DNS) strategy, which ensures stable training and high-fidelity results. Experimental results demonstrate the superiority of our method, which achieves geometrically consistent and highly realistic 3D virtual try-on generation.</p>\\n </div>\",\"PeriodicalId\":50645,\"journal\":{\"name\":\"Computer Animation and Virtual Worlds\",\"volume\":\"36 3\",\"pages\":\"\"},\"PeriodicalIF\":0.9000,\"publicationDate\":\"2025-05-27\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Computer Animation and Virtual Worlds\",\"FirstCategoryId\":\"94\",\"ListUrlMain\":\"https://onlinelibrary.wiley.com/doi/10.1002/cav.70054\",\"RegionNum\":4,\"RegionCategory\":\"计算机科学\",\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"Q4\",\"JCRName\":\"COMPUTER SCIENCE, SOFTWARE ENGINEERING\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Computer Animation and Virtual Worlds","FirstCategoryId":"94","ListUrlMain":"https://onlinelibrary.wiley.com/doi/10.1002/cav.70054","RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q4","JCRName":"COMPUTER SCIENCE, SOFTWARE ENGINEERING","Score":null,"Total":0}
Virtual try-on (VTON) technology enables the rapid creation of realistic try-on experiences, which makes it highly valuable for the metaverse and e-commerce. However, 2D VTON methods struggle to convey depth and immersion, while existing 3D methods require multi-view garment images and face challenges in generating high-fidelity garment textures. To address the aforementioned limitations, this paper proposes a panoramic Gaussian VTON framework guided solely by front-and-back garment information, named PG-VTON, which uses an adapted local controllable diffusion model for generating virtual dressing effects in specific regions. Specifically, PG-VTON adopts a coarse-to-fine architecture consisting of two stages. The coarse editing stage employs a local controllable diffusion model with a score distillation sampling (SDS) loss to generate coarse garment geometries with high-level semantics. Meanwhile, the refinement stage applies the same diffusion model with a photometric loss not only to enhance garment details and reduce artifacts but also to correct unwanted noise and distortions introduced during the coarse stage, thereby effectively enhancing realism. To improve training efficiency, we further introduce a dynamic noise scheduling (DNS) strategy, which ensures stable training and high-fidelity results. Experimental results demonstrate the superiority of our method, which achieves geometrically consistent and highly realistic 3D virtual try-on generation.
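The abstract names two mechanisms concrete enough to sketch: the score distillation sampling (SDS) loss driving the coarse stage, and the dynamic noise scheduling (DNS) strategy. Below is a minimal PyTorch sketch of one SDS update with a step-annealed timestep range. The paper's implementation is not given here, so the schedule shape, the `unet` wrapper and its call signature, and all hyperparameter names are illustrative assumptions, not PG-VTON's actual code.

```python
# Minimal sketch: SDS gradient on a differentiable render (e.g., a rendered
# Gaussian-splat image), with a hypothetical dynamic noise schedule (DNS).
# All names below (dynamic_t_range, unet, cond, guidance weighting) are
# assumptions for illustration, not the authors' published code.
import torch

def dynamic_t_range(step, max_steps, t_min=0.02, t_max=0.98, t_end=0.5):
    """Hypothetical DNS: anneal the upper bound of sampled diffusion
    timesteps as training progresses, so late iterations inject less
    noise and refine detail rather than reshaping coarse geometry."""
    frac = step / max_steps
    hi = t_max + (t_end - t_max) * frac  # linearly anneal t_max -> t_end
    return t_min, hi

def sds_loss(unet, alphas_cumprod, rendered, cond, step, max_steps):
    """One SDS update. `unet(noisy, t, cond)` is assumed to be a frozen
    diffusion prior that predicts the noise epsilon added to its input."""
    b = rendered.shape[0]
    t_lo, t_hi = dynamic_t_range(step, max_steps)
    n_steps = alphas_cumprod.shape[0]
    t = torch.randint(int(t_lo * n_steps), int(t_hi * n_steps), (b,),
                      device=rendered.device)
    noise = torch.randn_like(rendered)
    a = alphas_cumprod[t].view(b, 1, 1, 1)
    noisy = a.sqrt() * rendered + (1 - a).sqrt() * noise
    with torch.no_grad():
        eps_pred = unet(noisy, t, cond)  # frozen prior; no grads through it
    w = 1.0 - a                          # a common SDS weighting choice
    grad = w * (eps_pred - noise)
    # Reparameterize the gradient as an MSE against a detached target, so
    # autograd routes d(loss)/d(rendered) = grad back into the 3D parameters.
    target = (rendered - grad).detach()
    return 0.5 * torch.nn.functional.mse_loss(
        rendered, target, reduction="sum") / b
```

Annealing the upper timestep bound is one plausible reading of DNS: large timesteps early in training give strong high-level semantic gradients for coarse garment geometry, while smaller timesteps later preserve detail, which matches the coarse-to-fine design the abstract describes.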
Journal Introduction:
With the advent of very powerful PCs and high-end graphics cards, there has been incredible development in Virtual Worlds, real-time computer animation and simulation, and games. At the same time, new and cheaper Virtual Reality devices have appeared, allowing interaction with these real-time Virtual Worlds, and even with the real world through Augmented Reality. Three-dimensional characters, especially Virtual Humans, are now of such exceptional quality that they can be used in the movie industry. But this is only the beginning: with the development of Artificial Intelligence and Agent technology, these characters will become more and more autonomous, and even intelligent. They will inhabit Virtual Worlds, living a Virtual Life together with animals and plants.