Multi-Level Semantic and Spatial Hierarchy Reasoning for 3D Face Reconstruction and Dense Alignment in Unconstrained Environments

Impact factor 7.2 | Zone 1, Computer Science | JCR Q1, COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE
Lei Li, Fuqiang Liu, Junyuan Wang, Yanni Wang, Zhitao Zhang, Jiahao Li, Qi Wang
Journal: Applied Soft Computing, Volume 180, Article 113327
Published: 2025-06-14
DOI: 10.1016/j.asoc.2025.113327
Source: https://www.sciencedirect.com/science/article/pii/S1568494625006386
Citations: 0

Abstract

3D face reconstruction and dense alignment face significant challenges when samples come from highly unconstrained environments, particularly those with large poses, extreme expressions, occlusions, and complex backgrounds. Although careful design of the neural network architecture can enhance representation ability, performance remains unsatisfactory because accurate facial semantic and spatial hierarchy information is absent. To address this challenge, we propose an approach that uses neural networks to capture multi-level facial semantic and spatial hierarchy knowledge to guide the learning process. Specifically, our approach, the Multi-Level Semantic and Spatial Hierarchy Reasoning Network (MSHRNet), uses a point-to-space level progressive face structure loss function to precisely learn the semantic and spatial hierarchy knowledge of different facial parts. This knowledge is then injected into the backbone network through multi-level hierarchy knowledge matrices to incorporate structural reasoning knowledge, which can suppress the effects of large poses, extreme expressions, and occlusions. Moreover, we introduce a sample-to-dataset level data augmentation module that yields rich and diverse semantic and spatial hierarchy information, mitigating occlusions and complex backgrounds while learning fine-grained local details. Extensive quantitative and qualitative experiments on benchmark datasets demonstrate that MSHRNet outperforms state-of-the-art methods in both accuracy and computational complexity, at the cost of a small increase in the number of parameters. Code and all data are publicly available at https://github.com/Ray-tju/MSHRNet.
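The abstract does not describe how the augmentation module works, so the following is only a generic illustration of occlusion-style augmentation, not the authors' sample-to-dataset module. It uses a simple random-rectangle occlusion (in the spirit of random erasing), a common way to expose a model to partially hidden faces. The function name `random_occlusion` and the parameter `max_frac` are hypothetical names chosen for this sketch.

```python
import numpy as np


def random_occlusion(image, rng, max_frac=0.4):
    """Return a copy of `image` (H, W, C) with one random rectangle
    overwritten by uniform noise, plus the rectangle as (top, left, h, w).

    Generic occlusion augmentation; NOT the MSHRNet augmentation module.
    `max_frac` (hypothetical) caps the rectangle side relative to the image.
    """
    h, w = image.shape[:2]
    # Rectangle size: at least 1 pixel, at most max_frac of each side.
    occ_h = int(rng.integers(1, max(2, int(h * max_frac) + 1)))
    occ_w = int(rng.integers(1, max(2, int(w * max_frac) + 1)))
    # Top-left corner chosen so the rectangle stays inside the image.
    top = int(rng.integers(0, h - occ_h + 1))
    left = int(rng.integers(0, w - occ_w + 1))
    out = image.copy()
    noise = rng.integers(0, 256, size=(occ_h, occ_w) + image.shape[2:])
    out[top:top + occ_h, left:left + occ_w] = noise.astype(image.dtype)
    return out, (top, left, occ_h, occ_w)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    face = np.zeros((120, 120, 3), dtype=np.uint8)  # stand-in for a face crop
    occluded, box = random_occlusion(face, rng)
    print("occluded region (top, left, height, width):", box)
```

In a training pipeline such a function would be applied per sample before the image is passed to the network, leaving the 3D ground truth unchanged so the model must infer the hidden geometry from the visible context.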
Source journal
Applied Soft Computing
Category: Engineering & Technology, Computer Science: Interdisciplinary Applications
CiteScore: 15.80
Self-citation rate: 6.90%
Articles published: 874
Review time: 10.9 months
Journal description: Applied Soft Computing is an international journal promoting an integrated view of soft computing to solve real-life problems. The focus is to publish the highest quality research in the application and convergence of Fuzzy Logic, Neural Networks, Evolutionary Computing, Rough Sets, and other similar techniques to address real-world complexities. Applied Soft Computing is a rolling publication: articles are published as soon as the editor-in-chief has accepted them. The web site is therefore continuously updated with new articles, and the publication time is short.