EdgeFont: Enhancing style and content representations in few-shot font generation with multi-scale edge self-supervision

Yefei Wang, Kangyue Xiong, Yiyang Yuan, Jinshan Zeng

Expert Systems with Applications, Volume 262, Article 125547. Published 2024-10-28. DOI: 10.1016/j.eswa.2024.125547. URL: https://www.sciencedirect.com/science/article/pii/S095741742402414X
Citations: 0
Abstract
Font design is a meticulous and resource-intensive endeavor, especially for intricate Chinese fonts. Few-shot font generation (FFG), i.e., employing a few reference characters to create diverse characters in the same style, has garnered significant interest recently. Existing models predominantly aggregate style and content representations by learning either global or local style representations with neural networks. Yet they commonly lack effective information guidance during training or require costly acquisition of character-level information, limiting their performance and applicability in more intricate scenarios. To address this issue, this paper proposes a novel self-supervised few-shot font generation model, called EdgeFont, which introduces a multi-scale edge self-supervision (MSEE) module motivated by the observation that multi-scale edges can simultaneously capture both global and local style information. The introduced self-supervised module not only provides effective supervision for the learning of style and content but also keeps the cost of edge extraction low. Experimental results on various datasets show that the proposed model outperforms existing models in terms of PSNR, SSIM, MSE, LPIPS, and FID. In the most challenging few-shot generation setting, Unseen Fonts and Seen Characters, the proposed model achieves improvements of 0.95, 0.055, 0.063, 0.085, and 51.73 in PSNR, SSIM, MSE, LPIPS, and FID, respectively, compared to FUNIT. Moreover, after integrating the MSEE module into CG-GAN, the FID improves by 4.53, which fully demonstrates the strong scalability of MSEE.
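The abstract's key observation is that edge maps extracted at multiple scales are cheap to compute yet capture style at both global (overall stroke layout) and local (stroke-ending detail) levels. The following is a minimal illustrative sketch of that idea only, not the authors' actual MSEE module: it blurs a glyph image with Gaussians of increasing sigma and thresholds the gradient magnitude at each scale. All function names and parameters here are my own assumptions for illustration.

```python
import numpy as np

def gaussian_kernel1d(sigma, radius):
    """1-D Gaussian kernel, normalized to sum to 1."""
    x = np.arange(-radius, radius + 1)
    k = np.exp(-x**2 / (2.0 * sigma**2))
    return k / k.sum()

def blur(img, sigma):
    """Separable Gaussian blur with reflect padding (pure NumPy)."""
    radius = max(1, int(3 * sigma))
    k = gaussian_kernel1d(sigma, radius)
    pad = np.pad(img, radius, mode="reflect")
    # convolve rows, then columns; 'valid' mode restores the original size
    tmp = np.apply_along_axis(lambda r: np.convolve(r, k, mode="valid"), 1, pad)
    return np.apply_along_axis(lambda c: np.convolve(c, k, mode="valid"), 0, tmp)

def edge_map(img, thresh=0.2):
    """Binary edge map from central-difference gradient magnitude
    (a simplified stand-in for a Sobel/Canny-style detector)."""
    gx = np.zeros_like(img)
    gy = np.zeros_like(img)
    gx[:, 1:-1] = img[:, 2:] - img[:, :-2]
    gy[1:-1, :] = img[2:, :] - img[:-2, :]
    mag = np.hypot(gx, gy)
    if mag.max() == 0:
        return mag
    return (mag > thresh * mag.max()).astype(np.float32)

def multiscale_edges(img, sigmas=(0.5, 1.0, 2.0)):
    """Stack of edge maps at several blur scales: small sigmas keep fine
    local stroke detail, large sigmas keep only coarse global structure."""
    return np.stack([edge_map(blur(img, s)) for s in sigmas])

# Usage on a synthetic glyph-like image (white square on black):
glyph = np.zeros((32, 32))
glyph[8:24, 8:24] = 1.0
edges = multiscale_edges(glyph)  # shape (num_scales, H, W)
```

Such a stack could serve as a cheap self-supervision target: since the edges are derived from the image itself, no extra character annotations are needed, which matches the low-cost property the abstract emphasizes.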
Journal introduction:
Expert Systems With Applications is an international journal dedicated to the exchange of information on expert and intelligent systems used globally in industry, government, and universities. The journal emphasizes original papers covering the design, development, testing, implementation, and management of these systems, offering practical guidelines. It spans various sectors such as finance, engineering, marketing, law, project management, information management, medicine, and more. The journal also welcomes papers on multi-agent systems, knowledge management, neural networks, knowledge discovery, data mining, and other related areas, excluding applications to military/defense systems.