Skin Region Extraction and Person-Independent Deformable Face Templates for Fast Video Indexing

2011 IEEE International Symposium on Multimedia. Publication date: 2011-12-05. DOI: 10.1109/ISM.2011.75

S. Clippingdale, Mahito Fujii

Citation count: 5

Abstract

We describe a face tracking and recognition system for video and multimedia indexing that handles face regions at variable face poses (left-right and up-down), and deformations due to facial expressions and speech, by employing person-independent deformable templates at multiple poses on the view-sphere. An earlier version of the system handled variable poses (left-right only) by employing person-specific templates registered for each target individual at multiple poses. The new system speeds up processing by (i) extracting and restricting attention to skin-color regions, (ii) performing recognition using person-specific templates at near-frontal poses only, and (iii) tracking at non-frontal poses using the person-independent templates. Registration is also simplified, since multiple views of each target individual are no longer required, at the cost of a loss of recognition functionality at poses far from frontal (the system instead "remembers" the identity of each individual from near-frontal matches and tracks between them). We describe the skin region extraction process and the process by which the person-independent templates are constructed off-line from "bootstrap" face images of multiple non-target individuals, and we present experimental results showing the system in operation. Finally we discuss remaining issues in the practical application of the system to video and multimedia archive indexing.
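The control flow summarized in the abstract (restrict attention to skin-color regions, recognize only at near-frontal poses, and carry the remembered identity while tracking elsewhere) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the abstract does not specify the skin classifier, so `is_skin` uses a generic RGB rule of thumb rather than the authors' method, and the names `skin_bounding_box`, `label_track`, and the threshold `NEAR_FRONTAL_DEG` are hypothetical.

```python
from typing import List, Optional, Tuple

Pixel = Tuple[int, int, int]  # (R, G, B), each in 0..255

# Hypothetical cutoff for "near-frontal"; the abstract gives no value.
NEAR_FRONTAL_DEG = 15.0


def is_skin(p: Pixel) -> bool:
    """Generic RGB skin heuristic (an assumption, not the paper's classifier)."""
    r, g, b = p
    return (r > 95 and g > 40 and b > 20
            and max(p) - min(p) > 15
            and abs(r - g) > 15 and r > g and r > b)


def skin_bounding_box(
    image: List[List[Pixel]],
) -> Optional[Tuple[int, int, int, int]]:
    """Return (top, left, bottom, right) enclosing all skin pixels, or None."""
    rows = [y for y, row in enumerate(image) if any(is_skin(p) for p in row)]
    cols = [x for row in image for x, p in enumerate(row) if is_skin(p)]
    if not rows:
        return None  # no skin-colored pixels: nothing to search in this frame
    return min(rows), min(cols), max(rows), max(cols)


def label_track(
    yaw_deg: List[float],
    recognized: List[Optional[str]],
) -> List[Optional[str]]:
    """Label each frame of one face track with an identity.

    A recognition result is accepted only at near-frontal poses; at other
    poses the most recently remembered identity is carried forward.
    """
    identity: Optional[str] = None
    labels: List[Optional[str]] = []
    for yaw, name in zip(yaw_deg, recognized):
        if abs(yaw) <= NEAR_FRONTAL_DEG and name is not None:
            identity = name  # update memory from a near-frontal match
        labels.append(identity)
    return labels
```

This sketch only carries an identity forward in time. A system that labels the frames before the first near-frontal match would also need a backward pass over the track, which is omitted here.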
