Local part attention for image stylization with text prompt

Quoc-Truong Truong, Vinh-Tiep Nguyen, Lan-Phuong Nguyen, Hung-Phu Cao, Duc-Tuan Luu
Neural Computing and Applications, published 2024-09-17
DOI: 10.1007/s00521-024-10394-w (https://doi.org/10.1007/s00521-024-10394-w)
Citations: 0

Abstract

Prompt-based portrait style transfer aims to translate an input content image into a style described by text, without requiring a style image. In many practical situations, users want to stylize not only the entire portrait but also local parts (e.g., eyes, lips, and hair). To address such applications, we propose a new framework that applies style transfer to specific regions named in a text description of the desired style. Specifically, we incorporate semantic segmentation to identify the intended area without requiring edit masks from the user, and we use a pre-trained CLIP-based model for stylization. In addition, we propose a text-to-patch matching loss, computed on smaller patches obtained by randomly dividing the stylized image, to keep the quality of the result consistent. To evaluate the proposed method comprehensively, we report several metrics, including FID, SSIM, and PSNR, on a dataset that combines portraits from CelebAMask-HQ with style descriptions taken from related works. Extensive experimental results demonstrate that our framework outperforms other state-of-the-art methods in terms of both stylization quality and inference time.
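As a rough illustration of three ideas named in the abstract (restricting stylization to a segmented region, sampling random patches for a patch-level loss, and scoring with PSNR), the following dependency-free Python sketch may help. It is not the authors' implementation: the function names, the grayscale nested-list image format, the patch size, and the patch count are all assumptions made for the example, and the CLIP-based scoring of each patch against the text prompt is omitted because it needs a pre-trained model.

```python
import math
import random

# Assumption for this sketch: a grayscale image is a list of rows,
# each row a list of pixel values in [0, 1].
Image = list[list[float]]


def composite(content: Image, stylized: Image, mask: list[list[int]]) -> Image:
    """Keep stylized pixels where mask is 1 and content pixels elsewhere."""
    return [
        [s if m else c for c, s, m in zip(c_row, s_row, m_row)]
        for c_row, s_row, m_row in zip(content, stylized, mask)
    ]


def random_patches(image: Image, patch_size: int, count: int,
                   rng: random.Random) -> list[Image]:
    """Crop `count` square patches at uniformly random positions."""
    height, width = len(image), len(image[0])
    patches = []
    for _ in range(count):
        top = rng.randint(0, height - patch_size)
        left = rng.randint(0, width - patch_size)
        patches.append(
            [row[left:left + patch_size] for row in image[top:top + patch_size]]
        )
    return patches


def psnr(reference: Image, test: Image, peak: float = 1.0) -> float:
    """Peak signal-to-noise ratio in dB; infinite for identical images."""
    n = len(reference) * len(reference[0])
    mse = sum(
        (r - t) ** 2
        for r_row, t_row in zip(reference, test)
        for r, t in zip(r_row, t_row)
    ) / n
    return math.inf if mse == 0 else 10 * math.log10(peak ** 2 / mse)


# Toy 4x4 example: stylize only the left half, as a hypothetical
# segmentation mask for one facial region might select.
content = [[0.0] * 4 for _ in range(4)]
stylized = [[1.0] * 4 for _ in range(4)]
mask = [[1, 1, 0, 0] for _ in range(4)]

result = composite(content, stylized, mask)
patches = random_patches(result, patch_size=2, count=3, rng=random.Random(0))
score = psnr(content, result)
```

In a full pipeline each sampled patch would be embedded and compared with the text prompt, and the per-patch terms combined into a single loss; how the paper weights or aggregates them is not stated in the abstract.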
