EdgeFont: Enhancing style and content representations in few-shot font generation with multi-scale edge self-supervision

Yefei Wang, Kangyue Xiong, Yiyang Yuan, Jinshan Zeng

Expert Systems with Applications, Volume 262, Article 125547. Published 2024-10-28. DOI: 10.1016/j.eswa.2024.125547. URL: https://www.sciencedirect.com/science/article/pii/S095741742402414X
Citations: 0
Abstract
Font design is a meticulous and resource-intensive endeavor, especially for intricate Chinese fonts. Few-shot font generation (FFG), i.e., employing a few reference characters to create diverse characters in the same style, has garnered significant interest recently. Existing models predominantly aggregate style and content representations by learning either global or local style representations with neural networks. Yet they commonly lack effective information guidance during training or require costly acquisition of character-level information, limiting their performance and applicability in more intricate scenarios. To address this issue, this paper proposes a novel self-supervised few-shot font generation model, called EdgeFont, which introduces a multi-scale edge self-supervision (MSEE) module motivated by the observation that multi-scale edges can simultaneously capture both global and local style information. The introduced self-supervised module not only provides effective supervision for the learning of style and content but also keeps the cost of edge extraction low. Experimental results on various datasets show that the proposed model outperforms existing models in terms of PSNR, SSIM, MSE, LPIPS, and FID. In the most challenging few-shot generation setting, Unseen Fonts and Seen Characters, the proposed model achieves improvements of 0.95, 0.055, 0.063, 0.085, and 51.73 in PSNR, SSIM, MSE, LPIPS, and FID, respectively, compared to FUNIT. Moreover, after integrating the MSEE module into CG-GAN, the FID improves by 4.53, which fully demonstrates the strong scalability of MSEE.
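The abstract's key observation is that edge maps extracted at multiple scales are cheap to compute yet capture style at both global (overall stroke layout) and local (stroke-ending detail) levels. The following is a minimal illustrative sketch of that idea only, not the authors' actual MSEE module: it blurs a glyph image with Gaussians of increasing sigma and thresholds the gradient magnitude at each scale. All function names and parameters here are my own assumptions for illustration.

```python
import numpy as np

def gaussian_kernel1d(sigma, radius):
    """1-D Gaussian kernel, normalized to sum to 1."""
    x = np.arange(-radius, radius + 1)
    k = np.exp(-x**2 / (2.0 * sigma**2))
    return k / k.sum()

def blur(img, sigma):
    """Separable Gaussian blur with reflect padding (pure NumPy)."""
    radius = max(1, int(3 * sigma))
    k = gaussian_kernel1d(sigma, radius)
    pad = np.pad(img, radius, mode="reflect")
    # convolve rows, then columns; 'valid' mode restores the original size
    tmp = np.apply_along_axis(lambda r: np.convolve(r, k, mode="valid"), 1, pad)
    return np.apply_along_axis(lambda c: np.convolve(c, k, mode="valid"), 0, tmp)

def edge_map(img, thresh=0.2):
    """Binary edge map from central-difference gradient magnitude
    (a simplified stand-in for a Sobel/Canny-style detector)."""
    gx = np.zeros_like(img)
    gy = np.zeros_like(img)
    gx[:, 1:-1] = img[:, 2:] - img[:, :-2]
    gy[1:-1, :] = img[2:, :] - img[:-2, :]
    mag = np.hypot(gx, gy)
    if mag.max() == 0:
        return mag
    return (mag > thresh * mag.max()).astype(np.float32)

def multiscale_edges(img, sigmas=(0.5, 1.0, 2.0)):
    """Stack of edge maps at several blur scales: small sigmas keep fine
    local stroke detail, large sigmas keep only coarse global structure."""
    return np.stack([edge_map(blur(img, s)) for s in sigmas])

# Usage on a synthetic glyph-like image (white square on black):
glyph = np.zeros((32, 32))
glyph[8:24, 8:24] = 1.0
edges = multiscale_edges(glyph)  # shape (num_scales, H, W)
```

Such a stack could serve as a cheap self-supervision target: since the edges are derived from the image itself, no extra character annotations are needed, which matches the low-cost property the abstract emphasizes.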
Journal introduction:
Expert Systems With Applications is an international journal dedicated to the exchange of information on expert and intelligent systems used globally in industry, government, and universities. The journal emphasizes original papers covering the design, development, testing, implementation, and management of these systems, offering practical guidelines. It spans various sectors such as finance, engineering, marketing, law, project management, information management, medicine, and more. The journal also welcomes papers on multi-agent systems, knowledge management, neural networks, knowledge discovery, data mining, and other related areas, excluding applications to military/defense systems.