Huilin Liu, Qiong Fang, Caiping Xiang, Gaoming Yang

DisenStyler: Text-driven fast image stylization using content disentanglement and style adaptive matching
DOI: 10.1016/j.cag.2025.104275
Journal: Computers & Graphics (UK), Volume 130, Article 104275
Published: 2025-06-21 (Journal Article)
URL: https://www.sciencedirect.com/science/article/pii/S0097849325001165
JCR: Q2, Computer Science, Software Engineering; Impact Factor 2.5
Citations: 0
Abstract
The emergence of the CLIP (Contrastive Language-Image Pre-training) model has drawn widespread attention to text-driven image style transfer. However, existing methods are prone to content distortion in the generated images, and the transfer process is time-consuming. In this paper, we present DisenStyler, a novel text-driven fast image stylization method using content disentanglement and style adaptive matching. The Global-Local Feature Disentanglement and Fusion (GLFDF) module fuses content features extracted from the frequency and spatial domains, so that the detail information of the generated images is well preserved. Furthermore, the Style Adaptive Matching Module (SAMM) is designed to map text features into the image space and to conduct style adaptive matching using the means and variances of the text and image features. This not only significantly improves the speed of style transfer but also optimizes the local stylization effect of the generated images. Qualitative and quantitative experimental results show that DisenStyler better balances the content and style of the generated images while achieving fast image stylization.
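The abstract does not give SAMM's exact formulation, but matching channel-wise means and variances between a content representation and a style representation is the classic AdaIN-style recipe. A minimal NumPy sketch of that statistic-matching step is below; the function name, argument shapes, and the treatment of the style vector are my own assumptions, not the paper's implementation.

```python
import numpy as np

def adaptive_stat_matching(content_feat, style_feat, eps=1e-5):
    """AdaIN-style matching: re-normalize content features to the
    channel-wise mean and standard deviation of the style features.

    content_feat: (C, H, W) array of spatial content features.
    style_feat:   (C,) or (C, N) array of style features, e.g. text
                  features already mapped into the image feature space.
    """
    # Per-channel statistics of the content features.
    c_mean = content_feat.mean(axis=(1, 2), keepdims=True)
    c_std = content_feat.std(axis=(1, 2), keepdims=True) + eps

    # Accept a single style vector or a bag of style tokens per channel.
    s_feat = style_feat[:, None] if style_feat.ndim == 1 else style_feat
    s_mean = s_feat.mean(axis=1).reshape(-1, 1, 1)
    s_std = s_feat.std(axis=1).reshape(-1, 1, 1) + eps

    # Whiten the content, then re-color with the style statistics.
    normalized = (content_feat - c_mean) / c_std
    return normalized * s_std + s_mean
```

Because the operation is a pair of per-channel affine transforms rather than an iterative optimization, it runs in a single forward pass, which is consistent with the abstract's claim that SAMM speeds up the transfer.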
Journal overview:
Computers & Graphics is dedicated to disseminating information on research and applications of computer graphics (CG) techniques. The journal encourages articles on:
1. Research and applications of interactive computer graphics. We are particularly interested in novel interaction techniques and applications of CG to problem domains.
2. State-of-the-art papers on late-breaking, cutting-edge research on CG.
3. Information on innovative uses of graphics principles and technologies.
4. Tutorial papers on both teaching CG principles and innovative uses of CG in education.