Lei Li , Fuqiang Liu , Junyuan Wang , Yanni Wang , Zhitao Zhang , Jiahao Li , Qi Wang
{"title":"无约束环境下三维人脸重建和密集对齐的多层次语义和空间层次推理","authors":"Lei Li , Fuqiang Liu , Junyuan Wang , Yanni Wang , Zhitao Zhang , Jiahao Li , Qi Wang","doi":"10.1016/j.asoc.2025.113327","DOIUrl":null,"url":null,"abstract":"<div><div>3D face reconstruction and dense alignment tasks encounter significant challenges when the samples are in highly unconstrained environments, particularly with large poses, extreme expressions, occlusions and complex backgrounds. Although carefully designing the neural network architecture can enhance the representation ability, its performance is still unsatisfactory due to the absence of accurate facial semantic and spatial hierarchy information. To address this challenge, we propose an approach that uses neural networks to capture multi-level facial semantic and spatial hierarchy knowledge so as to guide the learning process. Specifically, our approach, referred to as Multi-Level Semantic and Spatial Hierarchy Reasoning Network (MSHRNet), leverages a point-to-space level progressive face structure loss function to precisely learn the semantic and spatial hierarchy knowledge of different facial parts. This knowledge is then injected into the backbone network through multi-level hierarchy knowledge matrices to incorporate structural reasoning knowledge, which can suppress the effects of large poses, extreme expressions, and occlusions. Moreover, we introduce a sample-to-dataset level data augmentation module that effectively yields rich and diverse semantic and spatial hierarchy information to inhibit occlusions and complex backgrounds while learning fine-grained local details. Extensive quantitative and qualitative experiments on benchmark datasets demonstrate that our MSHRNet outperforms the state-of-the-art methods in terms of both accuracy and computational complexity at the cost of little increase in the number of parameters. 
Codes and all data are publicly available at <span><span>https://github.com/Ray-tju/MSHRNet</span><svg><path></path></svg></span>.</div></div>","PeriodicalId":50737,"journal":{"name":"Applied Soft Computing","volume":"180 ","pages":"Article 113327"},"PeriodicalIF":7.2000,"publicationDate":"2025-06-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"Multi-Level semantic and spatial hierarchy reasoning for 3D face reconstruction and dense alignment in unconstrained environments\",\"authors\":\"Lei Li , Fuqiang Liu , Junyuan Wang , Yanni Wang , Zhitao Zhang , Jiahao Li , Qi Wang\",\"doi\":\"10.1016/j.asoc.2025.113327\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"<div><div>3D face reconstruction and dense alignment tasks encounter significant challenges when the samples are in highly unconstrained environments, particularly with large poses, extreme expressions, occlusions and complex backgrounds. Although carefully designing the neural network architecture can enhance the representation ability, its performance is still unsatisfactory due to the absence of accurate facial semantic and spatial hierarchy information. To address this challenge, we propose an approach that uses neural networks to capture multi-level facial semantic and spatial hierarchy knowledge so as to guide the learning process. Specifically, our approach, referred to as Multi-Level Semantic and Spatial Hierarchy Reasoning Network (MSHRNet), leverages a point-to-space level progressive face structure loss function to precisely learn the semantic and spatial hierarchy knowledge of different facial parts. This knowledge is then injected into the backbone network through multi-level hierarchy knowledge matrices to incorporate structural reasoning knowledge, which can suppress the effects of large poses, extreme expressions, and occlusions. 
Moreover, we introduce a sample-to-dataset level data augmentation module that effectively yields rich and diverse semantic and spatial hierarchy information to inhibit occlusions and complex backgrounds while learning fine-grained local details. Extensive quantitative and qualitative experiments on benchmark datasets demonstrate that our MSHRNet outperforms the state-of-the-art methods in terms of both accuracy and computational complexity at the cost of little increase in the number of parameters. Codes and all data are publicly available at <span><span>https://github.com/Ray-tju/MSHRNet</span><svg><path></path></svg></span>.</div></div>\",\"PeriodicalId\":50737,\"journal\":{\"name\":\"Applied Soft Computing\",\"volume\":\"180 \",\"pages\":\"Article 113327\"},\"PeriodicalIF\":7.2000,\"publicationDate\":\"2025-06-14\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Applied Soft Computing\",\"FirstCategoryId\":\"94\",\"ListUrlMain\":\"https://www.sciencedirect.com/science/article/pii/S1568494625006386\",\"RegionNum\":1,\"RegionCategory\":\"计算机科学\",\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"Q1\",\"JCRName\":\"COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Applied Soft Computing","FirstCategoryId":"94","ListUrlMain":"https://www.sciencedirect.com/science/article/pii/S1568494625006386","RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE","Score":null,"Total":0}
Multi-Level semantic and spatial hierarchy reasoning for 3D face reconstruction and dense alignment in unconstrained environments
3D face reconstruction and dense alignment tasks encounter significant challenges when the samples come from highly unconstrained environments, particularly those with large poses, extreme expressions, occlusions, and complex backgrounds. Although carefully designing the neural network architecture can enhance representation ability, performance remains unsatisfactory due to the absence of accurate facial semantic and spatial hierarchy information. To address this challenge, we propose an approach that uses neural networks to capture multi-level facial semantic and spatial hierarchy knowledge to guide the learning process. Specifically, our approach, referred to as the Multi-Level Semantic and Spatial Hierarchy Reasoning Network (MSHRNet), leverages a point-to-space-level progressive face structure loss function to precisely learn the semantic and spatial hierarchy knowledge of different facial parts. This knowledge is then injected into the backbone network through multi-level hierarchy knowledge matrices to incorporate structural reasoning knowledge, which suppresses the effects of large poses, extreme expressions, and occlusions. Moreover, we introduce a sample-to-dataset-level data augmentation module that yields rich and diverse semantic and spatial hierarchy information to counteract occlusions and complex backgrounds while learning fine-grained local details. Extensive quantitative and qualitative experiments on benchmark datasets demonstrate that our MSHRNet outperforms state-of-the-art methods in terms of both accuracy and computational complexity, at the cost of only a small increase in the number of parameters. Code and all data are publicly available at https://github.com/Ray-tju/MSHRNet.
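The abstract describes a loss that supervises face structure at multiple levels, from individual points up to whole-face spatial layout. The paper's actual formulation is not given here, so the following is only an illustrative sketch of the general idea: a toy loss that combines per-vertex error (point level), per-part centroid error (semantic region level), and whole-face centroid error (global level). All function and parameter names (`hierarchical_face_loss`, `region_ids`, the weights) are hypothetical.

```python
import numpy as np

def hierarchical_face_loss(pred, gt, region_ids,
                           w_point=1.0, w_region=0.5, w_global=0.25):
    """Toy multi-level reconstruction loss over a dense 3D face point set.

    pred, gt: (N, 3) arrays of predicted / ground-truth vertex positions.
    region_ids: (N,) integer labels assigning each vertex to a semantic
    facial part (e.g. eyes, nose, mouth). Not the paper's actual loss.
    """
    # Point level: mean per-vertex Euclidean error.
    point_err = np.linalg.norm(pred - gt, axis=1).mean()

    # Region level: distance between the centroids of each semantic part,
    # encouraging part-wise spatial consistency.
    regions = np.unique(region_ids)
    region_err = 0.0
    for r in regions:
        mask = region_ids == r
        region_err += np.linalg.norm(pred[mask].mean(axis=0)
                                     - gt[mask].mean(axis=0))
    region_err /= len(regions)

    # Global level: distance between whole-face centroids.
    global_err = np.linalg.norm(pred.mean(axis=0) - gt.mean(axis=0))

    return w_point * point_err + w_region * region_err + w_global * global_err
```

A loss of this shape lets coarse levels stabilize training under large poses and occlusions (where individual points are unreliable) while the point level preserves fine-grained detail; the paper's point-to-space progressive loss presumably balances these levels in a more principled way.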
Journal Introduction:
Applied Soft Computing is an international journal promoting an integrated view of soft computing to solve real-life problems. Its focus is to publish the highest-quality research on the application and convergence of Fuzzy Logic, Neural Networks, Evolutionary Computing, Rough Sets, and similar techniques to address real-world complexities.
Applied Soft Computing is a rolling publication: articles are published as soon as the editor-in-chief has accepted them. The website is therefore continuously updated with new articles, and publication times are short.