{"title":"PG-VTON:前后服装引导全景高斯虚拟试戴扩散建模","authors":"Jian Zheng, Shengwei Sang, Yifei Lu, Guojun Dai, Xiaoyang Mao, Wenhui Zhou","doi":"10.1002/cav.70054","DOIUrl":null,"url":null,"abstract":"<div>\n \n <p>Virtual try-on (VTON) technology enables the rapid creation of realistic try-on experiences, which makes it highly valuable for the metaverse and e-commerce. However, 2D VTON methods struggle to convey depth and immersion, while existing 3D methods require multi-view garment images and face challenges in generating high-fidelity garment textures. To address the aforementioned limitations, this paper proposes a panoramic Gaussian VTON framework guided solely by front-and-back garment information, named PG-VTON, which uses an adapted local controllable diffusion model for generating virtual dressing effects in specific regions. Specifically, PG-VTON adopts a coarse-to-fine architecture consisting of two stages. The coarse editing stage employs a local controllable diffusion model with a score distillation sampling (SDS) loss to generate coarse garment geometries with high-level semantics. Meanwhile, the refinement stage applies the same diffusion model with a photometric loss not only to enhance garment details and reduce artifacts but also to correct unwanted noise and distortions introduced during the coarse stage, thereby effectively enhancing realism. To improve training efficiency, we further introduce a dynamic noise scheduling (DNS) strategy, which ensures stable training and high-fidelity results. Experimental results demonstrate the superiority of our method, which achieves geometrically consistent and highly realistic 3D virtual try-on generation.</p>\n </div>","PeriodicalId":50645,"journal":{"name":"Computer Animation and Virtual Worlds","volume":"36 3","pages":""},"PeriodicalIF":0.9000,"publicationDate":"2025-05-27","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"PG-VTON: Front-And-Back Garment Guided Panoramic Gaussian Virtual Try-On With Diffusion Modeling\",\"authors\":\"Jian Zheng, Shengwei Sang, Yifei Lu, Guojun Dai, Xiaoyang Mao, Wenhui Zhou\",\"doi\":\"10.1002/cav.70054\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"<div>\\n \\n <p>Virtual try-on (VTON) technology enables the rapid creation of realistic try-on experiences, which makes it highly valuable for the metaverse and e-commerce. However, 2D VTON methods struggle to convey depth and immersion, while existing 3D methods require multi-view garment images and face challenges in generating high-fidelity garment textures. To address the aforementioned limitations, this paper proposes a panoramic Gaussian VTON framework guided solely by front-and-back garment information, named PG-VTON, which uses an adapted local controllable diffusion model for generating virtual dressing effects in specific regions. Specifically, PG-VTON adopts a coarse-to-fine architecture consisting of two stages. The coarse editing stage employs a local controllable diffusion model with a score distillation sampling (SDS) loss to generate coarse garment geometries with high-level semantics. Meanwhile, the refinement stage applies the same diffusion model with a photometric loss not only to enhance garment details and reduce artifacts but also to correct unwanted noise and distortions introduced during the coarse stage, thereby effectively enhancing realism. 
To improve training efficiency, we further introduce a dynamic noise scheduling (DNS) strategy, which ensures stable training and high-fidelity results. Experimental results demonstrate the superiority of our method, which achieves geometrically consistent and highly realistic 3D virtual try-on generation.</p>\\n </div>\",\"PeriodicalId\":50645,\"journal\":{\"name\":\"Computer Animation and Virtual Worlds\",\"volume\":\"36 3\",\"pages\":\"\"},\"PeriodicalIF\":0.9000,\"publicationDate\":\"2025-05-27\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Computer Animation and Virtual Worlds\",\"FirstCategoryId\":\"94\",\"ListUrlMain\":\"https://onlinelibrary.wiley.com/doi/10.1002/cav.70054\",\"RegionNum\":4,\"RegionCategory\":\"计算机科学\",\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"Q4\",\"JCRName\":\"COMPUTER SCIENCE, SOFTWARE ENGINEERING\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Computer Animation and Virtual Worlds","FirstCategoryId":"94","ListUrlMain":"https://onlinelibrary.wiley.com/doi/10.1002/cav.70054","RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q4","JCRName":"COMPUTER SCIENCE, SOFTWARE ENGINEERING","Score":null,"Total":0}
Virtual try-on (VTON) technology enables the rapid creation of realistic try-on experiences, which makes it highly valuable for the metaverse and e-commerce. However, 2D VTON methods struggle to convey depth and immersion, while existing 3D methods require multi-view garment images and face challenges in generating high-fidelity garment textures. To address the aforementioned limitations, this paper proposes a panoramic Gaussian VTON framework guided solely by front-and-back garment information, named PG-VTON, which uses an adapted local controllable diffusion model for generating virtual dressing effects in specific regions. Specifically, PG-VTON adopts a coarse-to-fine architecture consisting of two stages. The coarse editing stage employs a local controllable diffusion model with a score distillation sampling (SDS) loss to generate coarse garment geometries with high-level semantics. Meanwhile, the refinement stage applies the same diffusion model with a photometric loss not only to enhance garment details and reduce artifacts but also to correct unwanted noise and distortions introduced during the coarse stage, thereby effectively enhancing realism. To improve training efficiency, we further introduce a dynamic noise scheduling (DNS) strategy, which ensures stable training and high-fidelity results. Experimental results demonstrate the superiority of our method, which achieves geometrically consistent and highly realistic 3D virtual try-on generation.
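The abstract names two mechanisms concrete enough to sketch: the score distillation sampling (SDS) loss driving the coarse stage, and the dynamic noise scheduling (DNS) strategy. Below is a minimal PyTorch sketch of one SDS update with a step-annealed timestep range. The paper's implementation is not given here, so the schedule shape, the `unet` wrapper and its call signature, and all hyperparameter names are illustrative assumptions, not PG-VTON's actual code.

```python
# Minimal sketch: SDS gradient on a differentiable render (e.g., a rendered
# Gaussian-splat image), with a hypothetical dynamic noise schedule (DNS).
# All names below (dynamic_t_range, unet, cond, guidance weighting) are
# assumptions for illustration, not the authors' published code.
import torch

def dynamic_t_range(step, max_steps, t_min=0.02, t_max=0.98, t_end=0.5):
    """Hypothetical DNS: anneal the upper bound of sampled diffusion
    timesteps as training progresses, so late iterations inject less
    noise and refine detail rather than reshaping coarse geometry."""
    frac = step / max_steps
    hi = t_max + (t_end - t_max) * frac  # linearly anneal t_max -> t_end
    return t_min, hi

def sds_loss(unet, alphas_cumprod, rendered, cond, step, max_steps):
    """One SDS update. `unet(noisy, t, cond)` is assumed to be a frozen
    diffusion prior that predicts the noise epsilon added to its input."""
    b = rendered.shape[0]
    t_lo, t_hi = dynamic_t_range(step, max_steps)
    n_steps = alphas_cumprod.shape[0]
    t = torch.randint(int(t_lo * n_steps), int(t_hi * n_steps), (b,),
                      device=rendered.device)
    noise = torch.randn_like(rendered)
    a = alphas_cumprod[t].view(b, 1, 1, 1)
    noisy = a.sqrt() * rendered + (1 - a).sqrt() * noise
    with torch.no_grad():
        eps_pred = unet(noisy, t, cond)  # frozen prior; no grads through it
    w = 1.0 - a                          # a common SDS weighting choice
    grad = w * (eps_pred - noise)
    # Reparameterize the gradient as an MSE against a detached target, so
    # autograd routes d(loss)/d(rendered) = grad back into the 3D parameters.
    target = (rendered - grad).detach()
    return 0.5 * torch.nn.functional.mse_loss(
        rendered, target, reduction="sum") / b
```

Annealing the upper timestep bound is one plausible reading of DNS: large timesteps early in training give strong high-level semantic gradients for coarse garment geometry, while smaller timesteps later preserve detail, which matches the coarse-to-fine design the abstract describes.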
Journal Introduction:
With the advent of very powerful PCs and high-end graphics cards, there has been incredible development in Virtual Worlds, real-time computer animation and simulation, and games. At the same time, new and cheaper Virtual Reality devices have appeared, allowing interaction with these real-time Virtual Worlds, and even with the real world through Augmented Reality. Three-dimensional characters, especially Virtual Humans, are now of such exceptional quality that they can be used in the movie industry. But this is only the beginning: with the development of Artificial Intelligence and Agent technology, these characters will become more and more autonomous, and even intelligent. They will inhabit Virtual Worlds, living a Virtual Life together with animals and plants.