Hierarchical vectorization for facial images

IF 18.3 · Zone 3, Computer Science · Q1 COMPUTER SCIENCE, SOFTWARE ENGINEERING

Computational Visual Media Pub Date : 2023-11-30 DOI:10.1007/s41095-022-0314-4

Qian Fu, Linlin Liu, Fei Hou, Ying He


Citations: 0

Abstract

The explosive growth of social media means portrait editing and retouching are in high demand. While portraits are commonly captured and stored as raster images, editing raster images is non-trivial and requires the user to be highly skilled. Aiming at developing intuitive and easy-to-use portrait editing tools, we propose a novel vectorization method that can automatically convert raster images into a 3-tier hierarchical representation. The base layer consists of a set of sparse diffusion curves (DCs) which characterize salient geometric features and low-frequency colors, providing a means for semantic color transfer and facial expression editing. The middle level encodes specular highlights and shadows as large, editable Poisson regions (PRs) and allows the user to directly adjust illumination by tuning the strength and changing the shapes of PRs. The top level contains two types of pixel-sized PRs for high-frequency residuals and fine details such as pimples and pigmentation. We train a deep generative model that can produce high-frequency residuals automatically. Thanks to the inherent meaning in vector primitives, editing portraits becomes easy and intuitive. In particular, our method supports color transfer, facial expression editing, highlight and shadow editing, and automatic retouching. To quantitatively evaluate the results, we extend the commonly used FLIP metric (which measures color and feature differences between two images) to consider illumination. The new metric, illumination-sensitive FLIP, can effectively capture salient changes in color transfer results, and is more consistent with human perception than FLIP and other quality measures for portrait images. We evaluate our method on the FFHQR dataset and show it to be effective for common portrait editing tasks, such as retouching, light editing, color transfer, and expression editing.
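As a rough illustration of the diffusion idea behind the base layer, the sketch below fills a small grayscale grid from a handful of fixed pixels. It is not the authors' implementation. It assumes the standard diffusion-curve formulation, in which colors fixed along curves spread into the rest of the image by solving Laplace's equation, approximated here with plain Jacobi iteration. The function name `diffuse`, the 5 x 5 grid, and the two straight "curves" are hypothetical choices made for illustration only.

```python
def diffuse(constraints, width, height, iterations=500):
    """Fill a width x height grayscale grid by Laplace diffusion.

    constraints maps (x, y) -> fixed value (hypothetical stand-in for
    colors sampled along a diffusion curve). Every other pixel relaxes
    toward the average of its in-bounds 4-neighbours (Jacobi iteration).
    """
    grid = [[0.0] * width for _ in range(height)]
    for (x, y), value in constraints.items():
        grid[y][x] = value

    for _ in range(iterations):
        new_grid = [row[:] for row in grid]
        for y in range(height):
            for x in range(width):
                if (x, y) in constraints:
                    continue  # constrained pixels keep their value
                neighbours = [
                    grid[ny][nx]
                    for nx, ny in ((x - 1, y), (x + 1, y), (x, y - 1), (x, y + 1))
                    if 0 <= nx < width and 0 <= ny < height
                ]
                new_grid[y][x] = sum(neighbours) / len(neighbours)
        grid = new_grid
    return grid


# Two vertical "curves": dark along the left edge, bright along the right edge.
constraints = {(0, y): 0.0 for y in range(5)}
constraints.update({(4, y): 1.0 for y in range(5)})
image = diffuse(constraints, width=5, height=5)
```

With these constraints the unconstrained pixels settle into a smooth left-to-right ramp, which is the kind of low-frequency color field the abstract attributes to the sparse curves. Editing a constraint value and re-running the solve changes the whole field, which hints at why sparse vector primitives are convenient to edit.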


Source journal

Computational Visual Media (Computer Science: Computer Graphics and Computer-Aided Design)

CiteScore: 16.90

Self-citation rate: 5.80%

Articles published: 243

Review time: 6 weeks

About the journal: Computational Visual Media is a peer-reviewed open access journal. It publishes original high-quality research papers and significant review articles on novel ideas, methods, and systems relevant to visual media. Computational Visual Media publishes articles that focus on, but are not limited to, the following areas:

• Editing and composition of visual media
• Geometric computing for images and video
• Geometry modeling and processing
• Machine learning for visual media
• Physically based animation
• Realistic rendering
• Recognition and understanding of visual media
• Visual computing for robotics
• Visualization and visual analytics

Other interdisciplinary research into visual media that combines aspects of computer graphics, computer vision, image and video processing, geometric computing, and machine learning is also within the journal's scope. This is an open access journal, published quarterly by Tsinghua University Press and Springer. The open access fees (article-processing charges) are fully sponsored by Tsinghua University, China. Authors can publish in the journal without any additional charges.