Huilin Liu, Qiong Fang, Caiping Xiang, Gaoming Yang
DOI: 10.1016/j.cag.2025.104275
Journal: Computers & Graphics-UK, Volume 130, Article 104275
Impact Factor: 2.5 (JCR Q2, Computer Science, Software Engineering); CAS Region 4 (Computer Science)
Publication date: 2025-06-21 (Journal Article)
URL: https://www.sciencedirect.com/science/article/pii/S0097849325001165
Citations: 0
DisenStyler: Text-driven fast image stylization using content disentanglement and style adaptive matching
The emergence of the CLIP (Contrastive Language-Image Pre-Training) model has drawn widespread attention to text-driven image style transfer. However, existing methods are prone to content distortion in the generated images, and the transfer process is time-consuming. In this paper, we present DisenStyler, a novel text-driven fast image stylization method using content disentanglement and style adaptive matching. The Global-Local Feature Disentanglement and Fusion (GLFDF) module fuses content features extracted from the frequency and spatial domains, so that the detail information of the generated images is well preserved. Furthermore, the Style Adaptive Matching Module (SAMM) maps text features into the image feature space and performs style adaptive matching using the means and variances of the text and image features. This not only significantly improves the speed of style transfer but also improves the local stylization of the generated images. Qualitative and quantitative experimental results show that DisenStyler better balances the content and style of the generated images while achieving fast image stylization.
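The abstract states that SAMM matches styles "by utilizing the means and variances of text and images." The paper's implementation details are not given here, but matching channel-wise means and variances is the classic adaptive-instance-normalization idea, which can be sketched as follows. The function name, array shapes, and the assumption that text features have already been mapped into image feature space are all illustrative, not taken from the paper.

```python
import numpy as np

def adaptive_stat_matching(content, style, eps=1e-5):
    """Re-normalize content features so their per-channel mean and
    variance match those of the style features.

    content: (C, H, W) feature map extracted from the content image.
    style:   (C, N) features mapped from the text prompt into the image
             feature space (the mapping itself is assumed done upstream).
    """
    # Per-channel statistics of the content feature map.
    c_mean = content.mean(axis=(1, 2), keepdims=True)        # (C, 1, 1)
    c_std = content.std(axis=(1, 2), keepdims=True) + eps
    # Per-channel statistics of the (text-derived) style features.
    s_mean = style.mean(axis=1).reshape(-1, 1, 1)            # (C, 1, 1)
    s_std = style.std(axis=1).reshape(-1, 1, 1) + eps
    # Whiten the content statistics, then re-color with style statistics.
    return (content - c_mean) / c_std * s_std + s_mean
```

Because the operation is a single closed-form affine transform per channel, it runs in one pass over the features, which is consistent with the abstract's claim that statistic-based matching speeds up the transfer compared with iterative optimization.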
Journal overview:
Computers & Graphics is dedicated to disseminating information on research and applications of computer graphics (CG) techniques. The journal encourages articles on:
1. Research and applications of interactive computer graphics. We are particularly interested in novel interaction techniques and applications of CG to problem domains.
2. State-of-the-art papers on late-breaking, cutting-edge research on CG.
3. Information on innovative uses of graphics principles and technologies.
4. Tutorial papers on both teaching CG principles and innovative uses of CG in education.