Pyramidal structure-correlated refinement for robust face alignment
Authors: Qiyuan Dai, Qiang Ling
Journal: Knowledge-Based Systems, Volume 327, Article 114098
Publication date: 2025-07-24
DOI: 10.1016/j.knosys.2025.114098
URL: https://www.sciencedirect.com/science/article/pii/S0950705125011438
Impact factor: 7.6; JCR: Q1 (Computer Science, Artificial Intelligence)
Open access: no
Citations: 0
Abstract
Recent face alignment methods attempt to capture representations of facial landmarks and learn the correlations between them. However, they often ignore the consistency between local landmarks and the overall face shape, which can make correlation learning between distant landmarks inefficient. In addition, because localization is uncertain, these methods may capture invalid local cues for landmark representations. To resolve these issues, we propose a pyramidal structure-correlated refinement method that integrates a novel fusion interactor into a pyramidal refinement framework. Specifically, we introduce a fusion interactor that aggregates the local regression cues of landmark representations into a global representation and encodes facial structure information. This structure information is then allocated back to the local representations to compensate for missing landmark context, such as occluded parts. Unlike vanilla attention mechanisms, our fusion interactor performs indirect interaction to avoid inconsistent landmark contexts, and it adds only a small computational overhead. Additionally, to obtain valid local cues, we introduce a pyramidal refinement framework with multi-scale feature maps, which samples landmark representations from feature maps of specific scales according to the uncertainty of the sampling positions. It also gradually regularizes the global representation with correct multi-scale spatial contexts to constrain the overall face shape. Experiments on popular benchmarks demonstrate the effectiveness and robustness of the proposed method, especially its notably low failure rates in challenging scenarios.
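The abstract gives no formulas, so the following is only a rough sketch of the two ideas it names: interaction routed through a single global representation (aggregate, then allocate), and scale selection driven by uncertainty. Everything here is an assumption made for illustration, not the authors' method: the function names `fuse_and_allocate` and `pick_scale`, the softmax confidence weighting, the gated blend, and the rule that higher uncertainty maps to a coarser pyramid level.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def fuse_and_allocate(local_reps, confidences):
    """Indirect interaction: landmarks -> one global vector -> landmarks.

    local_reps: list of N landmark vectors (each a list of D floats).
    confidences: list of N scores; higher means a more trusted landmark.
    The weighting and the gated update are illustrative assumptions.
    """
    weights = softmax(confidences)
    dim = len(local_reps[0])
    # Aggregate: confidence-weighted mean of the local representations.
    global_rep = [
        sum(w * rep[d] for w, rep in zip(weights, local_reps))
        for d in range(dim)
    ]
    # Allocate: the less a landmark is trusted (for example, an occluded
    # one), the more of the global context it receives.
    max_w = max(weights)
    updated = []
    for w, rep in zip(weights, local_reps):
        gate = 1.0 - w / max_w
        updated.append(
            [(1.0 - gate) * x + gate * g for x, g in zip(rep, global_rep)]
        )
    return global_rep, updated


def pick_scale(uncertainty, num_scales):
    """Map an uncertainty in [0, 1] to a pyramid level (0 = finest).

    Assumption: uncertain positions are sampled from coarser maps.
    """
    clamped = min(max(uncertainty, 0.0), 1.0)
    return min(int(clamped * num_scales), num_scales - 1)


# Hypothetical toy input: the third landmark has low confidence.
reps = [[1.0, 0.0], [0.0, 1.0], [5.0, 5.0]]
conf = [2.0, 2.0, -2.0]
global_rep, updated = fuse_and_allocate(reps, conf)
```

In this sketch every landmark touches the global vector once in each direction, so the cost grows linearly with the number of landmarks, whereas pairwise attention between landmarks grows quadratically. That is one plausible reading of the "indirect interaction" described above, under the stated assumptions.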
About the journal:
Knowledge-Based Systems, an international and interdisciplinary journal in artificial intelligence, publishes original, innovative, and creative research results in the field. It focuses on systems based on knowledge-based and other artificial intelligence techniques. The journal aims to support human prediction and decision-making through data science and computation techniques, provide balanced coverage of theory and practical study, and encourage the development and implementation of knowledge-based intelligence models, methods, systems, and software tools. Applications in business, government, education, engineering, and healthcare are emphasized.