Wancheng Feng, Yingchao Liu, Jiaming Pei, Guangliang Cheng, Lukun Wang
Computer Vision and Image Understanding, Volume 257, Article 104339. Published 2025-04-02. DOI: 10.1016/j.cviu.2025.104339
Journal impact factor: 4.3. JCR: Q2 (Computer Science, Artificial Intelligence).
Article page: https://www.sciencedirect.com/science/article/pii/S1077314225000621
Local Consistency Guidance: Personalized Stylization Method of Face Video
Face video stylization aims to transform real face videos into a specified reference style. Although image stylization has achieved remarkable results, maintaining continuity and accurately preserving the original facial expressions in video stylization remains a significant challenge. This work introduces an approach to face video stylization that uses local consistency to keep quality consistent across the entire video. Specifically, the framework builds on existing diffusion models and employs local consistency as a guiding principle. It integrates a Local-Cross Attention (LCA) module to maintain style consistency between frames and a Local Style Transfer (LST) module to ensure seamless video continuity. Comparative experiments were conducted, with qualitative and quantitative analyses using frame consistency, SSIM, FID, LPIPS, flow similarity, and user studies, together with an ablation study. The experimental results demonstrate that the proposed Local Consistency Guidance (LCG) method effectively achieves continuous video stylization, and the authors report state-of-the-art results in video stylization. Further information is available on the project homepage at https://lcgfacevideostylization.github.io/github.io/.
About the journal:
The central focus of this journal is the computer analysis of pictorial information. Computer Vision and Image Understanding publishes papers covering all aspects of image analysis from the low-level, iconic processes of early vision to the high-level, symbolic processes of recognition and interpretation. A wide range of topics in the image understanding area is covered, including papers offering insights that differ from predominant views.
Research Areas Include:
• Theory
• Early vision
• Data structures and representations
• Shape
• Range
• Motion
• Matching and recognition
• Architecture and languages
• Vision systems