{"title":"Improved fine-tuning of mask-aware transformer for personalized face inpainting with semantic-aware regularization","authors":"Yuan Zeng , Yijing Sun , Yi Gong","doi":"10.1016/j.patrec.2025.07.009","DOIUrl":null,"url":null,"abstract":"<div><div>Recent advances in generative models have led to significant improvements in the challenging task of high-fidelity image inpainting. How to effectively guide or control these powerful models to perform personalized tasks has become an important open problem. In this letter, we introduce a semantic-aware fine-tuning method for adapting a pre-trained image inpainting model, the mask-aware transformer (MAT), to personalized face inpainting. Unlike existing methods, which tune a personalized generative prior with many reference images, our method can recover the key facial features of an individual from only a few input references. To improve fine-tuning stability in this few-reference setting, we propose a multiscale semantic-aware regularization that encourages the generated key facial components to match those in the references. Specifically, we generate a mask to extract the key facial components as prior knowledge and impose a semantic-based regularization on these regions at multiple scales, which significantly improves the fidelity and identity preservation of the facial components. Extensive experiments demonstrate that our method generates high-fidelity personalized face inpainting results from only three reference images, far fewer than existing personalized inpainting baselines require.</div></div>","PeriodicalId":54638,"journal":{"name":"Pattern Recognition Letters","volume":"197 ","pages":"Pages 95-101"},"PeriodicalIF":3.3000,"publicationDate":"2025-07-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Pattern Recognition Letters","FirstCategoryId":"94","ListUrlMain":"https://www.sciencedirect.com/science/article/pii/S0167865525002612","RegionNum":3,"RegionCategory":"Computer Science","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q2","JCRName":"COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE","Score":null,"Total":0}
Citations: 0
Abstract
Recent advances in generative models have led to significant improvements in the challenging task of high-fidelity image inpainting. How to effectively guide or control these powerful models to perform personalized tasks has become an important open problem. In this letter, we introduce a semantic-aware fine-tuning method for adapting a pre-trained image inpainting model, the mask-aware transformer (MAT), to personalized face inpainting. Unlike existing methods, which tune a personalized generative prior with many reference images, our method can recover the key facial features of an individual from only a few input references. To improve fine-tuning stability in this few-reference setting, we propose a multiscale semantic-aware regularization that encourages the generated key facial components to match those in the references. Specifically, we generate a mask to extract the key facial components as prior knowledge and impose a semantic-based regularization on these regions at multiple scales, which significantly improves the fidelity and identity preservation of the facial components. Extensive experiments demonstrate that our method generates high-fidelity personalized face inpainting results from only three reference images, far fewer than existing personalized inpainting baselines require.
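The abstract does not give the exact form of the regularization, but the idea it describes (penalize mismatch between generated and reference content inside key facial-component regions, at several scales) can be sketched as follows. This is a minimal illustrative sketch, not the paper's implementation: the function name is hypothetical, it works on single-channel pixel arrays rather than deep semantic features, and it uses naive strided downsampling in place of whatever multiscale pyramid the authors use.

```python
import numpy as np

def multiscale_semantic_regularization(generated, reference, component_mask,
                                       scales=(1, 2, 4)):
    """Hypothetical sketch of a multiscale semantic-aware regularizer:
    an L1 penalty between generated and reference images, restricted to
    key facial-component regions, averaged over several scales.

    generated, reference: (H, W) float arrays (single channel for brevity).
    component_mask: (H, W) binary array marking key facial components
    (e.g. eyes, nose, mouth), as would be produced by a face parser.
    """
    total = 0.0
    for s in scales:
        g = generated[::s, ::s]       # naive stride-s downsampling
        r = reference[::s, ::s]
        m = component_mask[::s, ::s]
        denom = m.sum() + 1e-8        # guard against an empty mask
        # L1 difference, counted only inside component regions,
        # normalized by the number of masked pixels at this scale
        total += np.abs((g - r) * m).sum() / denom
    return total / len(scales)
```

In a fine-tuning loop, a term like this would be weighted and added to the base inpainting loss, so that gradients push the generated facial components toward the reference identity while the rest of the image remains governed by the pre-trained model.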
About the journal:
Pattern Recognition Letters aims at rapid publication of concise articles of broad interest in pattern recognition.
Subject areas include all the current fields of interest represented by the Technical Committees of the International Association of Pattern Recognition, and other developing themes involving learning and recognition.