Raghav S K, Jahnavi A B, Vivek S D, Kirtan T S, P. Agarwal
Detail-Preserving Video-based Virtual Try-On (DPV-VTON)
Proceedings of the 2023 8th International Conference on Multimedia and Image Processing, 2023-04-21
DOI: https://doi.org/10.1145/3599589.3599599
Abstract
Virtual try-on systems render a desired garment on an image of a target person. They have prompted extensive research and attracted commercial interest. However, existing techniques are image-based and limited to in-shop target clothing drawn from a pre-defined dataset. To address this, we propose DPV-VTON, a video-based virtual try-on network that simulates try-on on a target person image using target clothing extracted from fashion videos, while preserving the garment's details and characteristics. The DPV-VTON pipeline has three core modules: (i) a Best Frame Selection (BFS) module that extracts the best frame from the video; (ii) a Clothing Extraction Module (CEM) that extracts the target clothing from the selected frame and generates a binary mask; and (iii) a virtual try-on module that synthesizes the final try-on result. Experiments on existing benchmark datasets and a curated video dataset demonstrate that DPV-VTON generates photo-realistic and visually promising results. The proposed model obtains the lowest FID and LPIPS scores and the highest SSIM score among the systems compared.
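
The data flow through the three modules can be illustrated with a minimal sketch. Everything inside the stages is an assumption made for illustration, since the abstract does not specify how any module works: the frame score (a simple neighbour-difference sharpness measure), the threshold-based mask, and the paste-style compositing are stand-ins, and all function names are hypothetical. In particular, the third stage here is plain pixel replacement, not a learned synthesis network.

```python
from typing import List, Tuple

Image = List[List[int]]  # grayscale image as rows of 0-255 intensities


def sharpness(frame: Image) -> float:
    """Assumed frame score: mean squared difference of horizontal neighbours."""
    total, count = 0, 0
    for row in frame:
        for a, b in zip(row, row[1:]):
            total += (a - b) ** 2
            count += 1
    return total / count if count else 0.0


def select_best_frame(frames: List[Image]) -> Image:
    """Stage (i): return the frame with the highest assumed score."""
    return max(frames, key=sharpness)


def extract_clothing(frame: Image, threshold: int = 128) -> Tuple[Image, Image]:
    """Stage (ii): return (clothing, binary mask).

    Assumed rule: a pixel belongs to the garment if it is >= threshold.
    """
    mask = [[1 if p >= threshold else 0 for p in row] for row in frame]
    clothing = [
        [p * m for p, m in zip(row, mask_row)]
        for row, mask_row in zip(frame, mask)
    ]
    return clothing, mask


def try_on(person: Image, clothing: Image, mask: Image) -> Image:
    """Stage (iii): placeholder compositing; copies garment pixels where mask is 1."""
    return [
        [c if m else p for p, c, m in zip(p_row, c_row, m_row)]
        for p_row, c_row, m_row in zip(person, clothing, mask)
    ]


def dpv_vton_sketch(frames: List[Image], person: Image) -> Image:
    """Chain the three stages in the order the abstract lists them."""
    best = select_best_frame(frames)
    clothing, mask = extract_clothing(best)
    return try_on(person, clothing, mask)
```

Calling `dpv_vton_sketch(frames, person)` with a list of equally sized grayscale frames and a person image of the same size returns the composited image. The sketch only conveys the interface between stages: a video goes in, one frame is selected, a garment and its binary mask come out of that frame, and both are passed to the try-on stage together with the person image.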