Lei Li , Fuqiang Liu , Junyuan Wang , Yanni Wang , Zhitao Zhang , Jiahao Li , Qi Wang
{"title":"无约束环境下三维人脸重建和密集对齐的多层次语义和空间层次推理","authors":"Lei Li , Fuqiang Liu , Junyuan Wang , Yanni Wang , Zhitao Zhang , Jiahao Li , Qi Wang","doi":"10.1016/j.asoc.2025.113327","DOIUrl":null,"url":null,"abstract":"<div><div>3D face reconstruction and dense alignment tasks encounter significant challenges when the samples are in highly unconstrained environments, particularly with large poses, extreme expressions, occlusions and complex backgrounds. Although carefully designing the neural network architecture can enhance the representation ability, its performance is still unsatisfactory due to the absence of accurate facial semantic and spatial hierarchy information. To address this challenge, we propose an approach that uses neural networks to capture multi-level facial semantic and spatial hierarchy knowledge so as to guide the learning process. Specifically, our approach, referred to as Multi-Level Semantic and Spatial Hierarchy Reasoning Network (MSHRNet), leverages a point-to-space level progressive face structure loss function to precisely learn the semantic and spatial hierarchy knowledge of different facial parts. This knowledge is then injected into the backbone network through multi-level hierarchy knowledge matrices to incorporate structural reasoning knowledge, which can suppress the effects of large poses, extreme expressions, and occlusions. Moreover, we introduce a sample-to-dataset level data augmentation module that effectively yields rich and diverse semantic and spatial hierarchy information to inhibit occlusions and complex backgrounds while learning fine-grained local details. Extensive quantitative and qualitative experiments on benchmark datasets demonstrate that our MSHRNet outperforms the state-of-the-art methods in terms of both accuracy and computational complexity at the cost of little increase in the number of parameters. 
Codes and all data are publicly available at <span><span>https://github.com/Ray-tju/MSHRNet</span><svg><path></path></svg></span>.</div></div>","PeriodicalId":50737,"journal":{"name":"Applied Soft Computing","volume":"180 ","pages":"Article 113327"},"PeriodicalIF":7.2000,"publicationDate":"2025-06-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"Multi-Level semantic and spatial hierarchy reasoning for 3D face reconstruction and dense alignment in unconstrained environments\",\"authors\":\"Lei Li , Fuqiang Liu , Junyuan Wang , Yanni Wang , Zhitao Zhang , Jiahao Li , Qi Wang\",\"doi\":\"10.1016/j.asoc.2025.113327\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"<div><div>3D face reconstruction and dense alignment tasks encounter significant challenges when the samples are in highly unconstrained environments, particularly with large poses, extreme expressions, occlusions and complex backgrounds. Although carefully designing the neural network architecture can enhance the representation ability, its performance is still unsatisfactory due to the absence of accurate facial semantic and spatial hierarchy information. To address this challenge, we propose an approach that uses neural networks to capture multi-level facial semantic and spatial hierarchy knowledge so as to guide the learning process. Specifically, our approach, referred to as Multi-Level Semantic and Spatial Hierarchy Reasoning Network (MSHRNet), leverages a point-to-space level progressive face structure loss function to precisely learn the semantic and spatial hierarchy knowledge of different facial parts. This knowledge is then injected into the backbone network through multi-level hierarchy knowledge matrices to incorporate structural reasoning knowledge, which can suppress the effects of large poses, extreme expressions, and occlusions. 
Moreover, we introduce a sample-to-dataset level data augmentation module that effectively yields rich and diverse semantic and spatial hierarchy information to inhibit occlusions and complex backgrounds while learning fine-grained local details. Extensive quantitative and qualitative experiments on benchmark datasets demonstrate that our MSHRNet outperforms the state-of-the-art methods in terms of both accuracy and computational complexity at the cost of little increase in the number of parameters. Codes and all data are publicly available at <span><span>https://github.com/Ray-tju/MSHRNet</span><svg><path></path></svg></span>.</div></div>\",\"PeriodicalId\":50737,\"journal\":{\"name\":\"Applied Soft Computing\",\"volume\":\"180 \",\"pages\":\"Article 113327\"},\"PeriodicalIF\":7.2000,\"publicationDate\":\"2025-06-14\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Applied Soft Computing\",\"FirstCategoryId\":\"94\",\"ListUrlMain\":\"https://www.sciencedirect.com/science/article/pii/S1568494625006386\",\"RegionNum\":1,\"RegionCategory\":\"计算机科学\",\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"Q1\",\"JCRName\":\"COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Applied Soft Computing","FirstCategoryId":"94","ListUrlMain":"https://www.sciencedirect.com/science/article/pii/S1568494625006386","RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE","Score":null,"Total":0}
Multi-Level semantic and spatial hierarchy reasoning for 3D face reconstruction and dense alignment in unconstrained environments
3D face reconstruction and dense alignment tasks encounter significant challenges when the samples come from highly unconstrained environments, particularly those with large poses, extreme expressions, occlusions, and complex backgrounds. Although carefully designing the neural network architecture can enhance representation ability, performance remains unsatisfactory due to the absence of accurate facial semantic and spatial hierarchy information. To address this challenge, we propose an approach that uses neural networks to capture multi-level facial semantic and spatial hierarchy knowledge to guide the learning process. Specifically, our approach, referred to as the Multi-Level Semantic and Spatial Hierarchy Reasoning Network (MSHRNet), leverages a point-to-space-level progressive face structure loss function to precisely learn the semantic and spatial hierarchy knowledge of different facial parts. This knowledge is then injected into the backbone network through multi-level hierarchy knowledge matrices to incorporate structural reasoning knowledge, which suppresses the effects of large poses, extreme expressions, and occlusions. Moreover, we introduce a sample-to-dataset-level data augmentation module that yields rich and diverse semantic and spatial hierarchy information to counteract occlusions and complex backgrounds while learning fine-grained local details. Extensive quantitative and qualitative experiments on benchmark datasets demonstrate that our MSHRNet outperforms state-of-the-art methods in terms of both accuracy and computational complexity, at the cost of only a small increase in the number of parameters. Code and all data are publicly available at https://github.com/Ray-tju/MSHRNet.
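The abstract describes a loss that supervises face structure at multiple levels, from individual points up to whole-face spatial layout. The paper's actual formulation is not given here, so the following is only an illustrative sketch of the general idea: a toy loss that combines per-vertex error (point level), per-part centroid error (semantic region level), and whole-face centroid error (global level). All function and parameter names (`hierarchical_face_loss`, `region_ids`, the weights) are hypothetical.

```python
import numpy as np

def hierarchical_face_loss(pred, gt, region_ids,
                           w_point=1.0, w_region=0.5, w_global=0.25):
    """Toy multi-level reconstruction loss over a dense 3D face point set.

    pred, gt: (N, 3) arrays of predicted / ground-truth vertex positions.
    region_ids: (N,) integer labels assigning each vertex to a semantic
    facial part (e.g. eyes, nose, mouth). Not the paper's actual loss.
    """
    # Point level: mean per-vertex Euclidean error.
    point_err = np.linalg.norm(pred - gt, axis=1).mean()

    # Region level: distance between the centroids of each semantic part,
    # encouraging part-wise spatial consistency.
    regions = np.unique(region_ids)
    region_err = 0.0
    for r in regions:
        mask = region_ids == r
        region_err += np.linalg.norm(pred[mask].mean(axis=0)
                                     - gt[mask].mean(axis=0))
    region_err /= len(regions)

    # Global level: distance between whole-face centroids.
    global_err = np.linalg.norm(pred.mean(axis=0) - gt.mean(axis=0))

    return w_point * point_err + w_region * region_err + w_global * global_err
```

A loss of this shape lets coarse levels stabilize training under large poses and occlusions (where individual points are unreliable) while the point level preserves fine-grained detail; the paper's point-to-space progressive loss presumably balances these levels in a more principled way.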
Journal Introduction:
Applied Soft Computing is an international journal promoting an integrated view of soft computing to solve real-life problems. Its focus is to publish the highest-quality research on the application and convergence of Fuzzy Logic, Neural Networks, Evolutionary Computing, Rough Sets, and similar techniques to address real-world complexities.
Applied Soft Computing is a rolling publication: articles are published as soon as the editor-in-chief has accepted them. The website is therefore continuously updated with new articles, and publication times are short.