Facial Features Tracking for Gross Head Movement Analysis and Expression Recognition

Dimitris N. Metaxas
2007 IEEE 9th Workshop on Multimedia Signal Processing
DOI: 10.1109/MMSP.2007.4412803
Published: October 2007
Citations: 1

Abstract

Summary form only given. The tracking and recognition of facial expressions from a single camera is an important and challenging problem. We present a real-time framework for Action Unit (AU) and expression recognition based on facial feature tracking and AdaBoost. Accurate facial feature tracking is challenging due to changes in illumination, skin-color variation, potentially large head rotations, partial occlusions, and fast head movements. We use models based on Active Shapes to localize facial features on a face in a generic pose. The shapes of facial features undergo non-linear transformations as the head rotates from frontal view to profile view. We learn the non-linear shape manifold as multiple overlapping subspaces, with different subspaces representing different head poses. Face alignment is performed by searching over the non-linear shape manifold and aligning the landmark points to the features' boundaries. The localized features are tracked across multiple frames with a KLT tracker, constraining the shape to lie on the non-linear manifold. Our tracking framework has been successfully used for detecting gross head movements, such as nodding and shaking, as well as for head-pose prediction. Further, we use the tracked features to accurately extract bounded faces from a video sequence and use them for recognizing facial expressions. Our approach is based on coded dynamic features. To capture the dynamic characteristics of facial events, we design dynamic Haar-like features that represent the temporal variation of facial events. Inspired by binary pattern coding, we further encode the dynamic Haar-like features into binary pattern features, which are useful for constructing weak classifiers for boosted learning. Finally, AdaBoost is used to learn a set of discriminating coded dynamic features for facial action unit and expression recognition. We have achieved approximately a 97% detection rate for gross head movements such as shaking and nodding.
The recognition rates for facial expressions average about 95% for the most important action units.
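The manifold-constrained tracking idea described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: each head-pose cluster is modeled as a linear (PCA) subspace of landmark shapes, and a drifting tracked shape is projected back onto the nearest subspace. All function names and the toy data are illustrative.

```python
import numpy as np

def learn_pose_subspaces(shapes_by_pose, n_components=2):
    """Learn one linear (PCA) subspace per head-pose cluster.
    shapes_by_pose: dict pose -> array of shape (n_samples, 2*n_landmarks)."""
    subspaces = {}
    for pose, X in shapes_by_pose.items():
        mean = X.mean(axis=0)
        # principal directions via SVD of the centered shapes
        _, _, Vt = np.linalg.svd(X - mean, full_matrices=False)
        subspaces[pose] = (mean, Vt[:n_components])
    return subspaces

def project_to_manifold(shape, subspaces):
    """Constrain a tracked shape to the nearest pose subspace
    (lowest reconstruction error)."""
    best, best_err = None, np.inf
    for mean, basis in subspaces.values():
        coeffs = basis @ (shape - mean)
        recon = mean + basis.T @ coeffs
        err = np.linalg.norm(shape - recon)
        if err < best_err:
            best, best_err = recon, err
    return best

# Toy demo: two pose clusters of 2-landmark (4-D) shapes.
rng = np.random.default_rng(0)
frontal = rng.normal([0, 0, 1, 1], 0.1, size=(50, 4))
profile = rng.normal([2, 0, 3, 1], 0.1, size=(50, 4))
subs = learn_pose_subspaces({"frontal": frontal, "profile": profile})

noisy = np.array([2.1, 0.05, 2.9, 1.0]) + 0.5  # drifted tracker output
constrained = project_to_manifold(noisy, subs)  # snapped back to the manifold
```

In a real tracker, `project_to_manifold` would be applied after each KLT update, so per-frame optical-flow noise cannot push the landmark configuration off the learned shape manifold.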
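The coded dynamic-feature pipeline can likewise be sketched in miniature, again as an assumption-laden illustration rather than the paper's method: a two-rectangle Haar-like response is computed per frame, its frame-to-frame variation is binarized into a pattern code, and a hand-rolled AdaBoost combines one-bit decision stumps over the code bits. All names, thresholds, and the synthetic data are hypothetical.

```python
import numpy as np

def haar_response(frame, top, bottom):
    """Two-rectangle Haar-like feature: mean(top band) - mean(bottom band)."""
    return frame[top].mean() - frame[bottom].mean()

def dynamic_binary_code(responses):
    """Binary-pattern coding of temporal variation:
    bit t is 1 if the response increased from frame t to t+1."""
    return (np.diff(np.asarray(responses)) > 0).astype(int)

def adaboost_stumps(X, y, n_rounds=10):
    """AdaBoost over one-bit decision stumps on binary features.
    X: (n, d) array of {0,1} codes; y in {-1,+1}."""
    n, d = X.shape
    w = np.full(n, 1.0 / n)
    model = []
    for _ in range(n_rounds):
        best = None
        for j in range(d):
            for pol in (+1, -1):
                pred = pol * (2 * X[:, j] - 1)  # map bit -> {-1,+1}
                err = w[pred != y].sum()
                if best is None or err < best[0]:
                    best = (err, j, pol)
        err, j, pol = best
        err = min(max(err, 1e-10), 1 - 1e-10)  # avoid log(0)
        alpha = 0.5 * np.log((1 - err) / err)
        pred = pol * (2 * X[:, j] - 1)
        w *= np.exp(-alpha * y * pred)       # reweight misclassified samples
        w /= w.sum()
        model.append((j, pol, alpha))
    return model

def predict(model, X):
    score = sum(a * p * (2 * X[:, j] - 1) for j, p, a in model)
    return np.where(score >= 0, 1, -1)

# Toy demo: a Haar response over a 4x4 "face" patch, a coded sequence,
# and boosting on synthetic codes whose first bit separates the classes.
frame = np.arange(16.0).reshape(4, 4)
r = haar_response(frame, slice(0, 2), slice(2, 4))
code = dynamic_binary_code([0.1, 0.3, 0.2, 0.5])

rng = np.random.default_rng(1)
X = rng.integers(0, 2, size=(40, 8))
y = np.where(X[:, 0] == 1, 1, -1)
model = adaboost_stumps(X, y)
```

The weak learners here are deliberately trivial (a single code bit with a polarity); the point is only the shape of the pipeline: temporal coding turns a real-valued feature trajectory into bits that cheap stumps, and hence boosting, can consume.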