助记下降法:一种用于端到端人脸对齐的循环过程

2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) Pub Date : 2016-06-27 DOI:10.1109/CVPR.2016.453

George Trigeorgis, Patrick Snape, M. Nicolaou, Epameinondas Antonakos, S. Zafeiriou

{"title":"助记下降法:一种用于端到端人脸对齐的循环过程","authors":"George Trigeorgis, Patrick Snape, M. Nicolaou, Epameinondas Antonakos, S. Zafeiriou","doi":"10.1109/CVPR.2016.453","DOIUrl":null,"url":null,"abstract":"Cascaded regression has recently become the method of choice for solving non-linear least squares problems such as deformable image alignment. Given a sizeable training set, cascaded regression learns a set of generic rules that are sequentially applied to minimise the least squares problem. Despite the success of cascaded regression for problems such as face alignment and head pose estimation, there are several shortcomings arising in the strategies proposed thus far. Specifically, (a) the regressors are learnt independently, (b) the descent directions may cancel one another out and (c) handcrafted features (e.g., HoGs, SIFT etc.) are mainly used to drive the cascade, which may be sub-optimal for the task at hand. In this paper, we propose a combined and jointly trained convolutional recurrent neural network architecture that allows the training of an end-to-end to system that attempts to alleviate the aforementioned drawbacks. The recurrent module facilitates the joint optimisation of the regressors by assuming the cascades form a nonlinear dynamical system, in effect fully utilising the information between all cascade levels by introducing a memory unit that shares information across all levels. The convolutional module allows the network to extract features that are specialised for the task at hand and are experimentally shown to outperform hand-crafted features. We show that the application of the proposed architecture for the problem of face alignment results in a strong improvement over the current state-of-the-art.","PeriodicalId":6515,"journal":{"name":"2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR)","volume":"47 1","pages":"4177-4187"},"PeriodicalIF":0.0000,"publicationDate":"2016-06-27","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"340","resultStr":"{\"title\":\"Mnemonic Descent Method: A Recurrent Process Applied for End-to-End Face Alignment\",\"authors\":\"George Trigeorgis, Patrick Snape, M. Nicolaou, Epameinondas Antonakos, S. Zafeiriou\",\"doi\":\"10.1109/CVPR.2016.453\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Cascaded regression has recently become the method of choice for solving non-linear least squares problems such as deformable image alignment. Given a sizeable training set, cascaded regression learns a set of generic rules that are sequentially applied to minimise the least squares problem. Despite the success of cascaded regression for problems such as face alignment and head pose estimation, there are several shortcomings arising in the strategies proposed thus far. Specifically, (a) the regressors are learnt independently, (b) the descent directions may cancel one another out and (c) handcrafted features (e.g., HoGs, SIFT etc.) are mainly used to drive the cascade, which may be sub-optimal for the task at hand. In this paper, we propose a combined and jointly trained convolutional recurrent neural network architecture that allows the training of an end-to-end to system that attempts to alleviate the aforementioned drawbacks. The recurrent module facilitates the joint optimisation of the regressors by assuming the cascades form a nonlinear dynamical system, in effect fully utilising the information between all cascade levels by introducing a memory unit that shares information across all levels. The convolutional module allows the network to extract features that are specialised for the task at hand and are experimentally shown to outperform hand-crafted features. We show that the application of the proposed architecture for the problem of face alignment results in a strong improvement over the current state-of-the-art.\",\"PeriodicalId\":6515,\"journal\":{\"name\":\"2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR)\",\"volume\":\"47 1\",\"pages\":\"4177-4187\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2016-06-27\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"340\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR)\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1109/CVPR.2016.453\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/CVPR.2016.453","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}

引用次数: 340

摘要

近年来，级联回归已成为求解非线性最小二乘问题(如可变形图像对齐)的首选方法。给定一个相当大的训练集，级联回归学习一组通用规则，这些规则依次应用于最小化最小二乘问题。尽管级联回归在人脸对齐和头姿估计等问题上取得了成功，但目前提出的策略存在一些缺点。具体来说，(a)回归量是独立学习的，(b)下降方向可能相互抵消，(c)手工制作的特征(例如，hog, SIFT等)主要用于驱动级联，这对于手头的任务来说可能是次优的。在本文中，我们提出了一个组合和联合训练的卷积递归神经网络架构，该架构允许端到端系统的训练，试图减轻上述缺点。通过假设级联形成一个非线性动力系统，循环模块促进了回归量的联合优化，通过引入一个在所有级联级别之间共享信息的存储单元，有效地充分利用了所有级联级别之间的信息。卷积模块允许网络提取专门用于手头任务的特征，并在实验中证明优于手工制作的特征。我们表明，所提出的结构在人脸对齐问题上的应用比目前的最新技术有了很大的改进。

本文章由计算机程序翻译，如有差异，请以英文原文为准。

查看原文本刊更多论文

Mnemonic Descent Method: A Recurrent Process Applied for End-to-End Face Alignment

Cascaded regression has recently become the method of choice for solving non-linear least squares problems such as deformable image alignment. Given a sizeable training set, cascaded regression learns a set of generic rules that are sequentially applied to minimise the least squares problem. Despite the success of cascaded regression for problems such as face alignment and head pose estimation, there are several shortcomings arising in the strategies proposed thus far. Specifically, (a) the regressors are learnt independently, (b) the descent directions may cancel one another out and (c) handcrafted features (e.g., HoGs, SIFT etc.) are mainly used to drive the cascade, which may be sub-optimal for the task at hand. In this paper, we propose a combined and jointly trained convolutional recurrent neural network architecture that allows the training of an end-to-end to system that attempts to alleviate the aforementioned drawbacks. The recurrent module facilitates the joint optimisation of the regressors by assuming the cascades form a nonlinear dynamical system, in effect fully utilising the information between all cascade levels by introducing a memory unit that shares information across all levels. The convolutional module allows the network to extract features that are specialised for the task at hand and are experimentally shown to outperform hand-crafted features. We show that the application of the proposed architecture for the problem of face alignment results in a strong improvement over the current state-of-the-art.

求助全文

通过发布文献求助，成功后即可免费获取论文全文。去求助

来源期刊

2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR)

自引率

0.00%

发文量