Skin Region Extraction and Person-Independent Deformable Face Templates for Fast Video Indexing
S. Clippingdale, Mahito Fujii
2011 IEEE International Symposium on Multimedia, published 2011-12-05
DOI: 10.1109/ISM.2011.75 (https://doi.org/10.1109/ISM.2011.75)
Citations: 5
Abstract
We describe a face tracking and recognition system for video and multimedia indexing that handles face regions at variable face poses (left-right and up-down), as well as deformations due to facial expressions and speech, by employing person-independent deformable templates at multiple poses on the view-sphere. An earlier version of the system handled variable poses (left-right only) by employing person-specific templates registered for each target individual at multiple poses. The new system speeds up processing by (i) extracting and restricting attention to skin-color regions, (ii) performing recognition using person-specific templates at near-frontal poses only, and (iii) tracking at non-frontal poses using the person-independent templates. Registration is also simplified, since multiple views of each target individual are no longer required, at the cost of losing recognition functionality at poses far from frontal (the system instead "remembers" the identity of each individual from near-frontal matches and tracks between them). We describe the skin region extraction process and the process by which the person-independent templates are constructed off-line from "bootstrap" face images of multiple non-target individuals, and we present experimental results showing the system in operation. Finally, we discuss remaining issues in the practical application of the system to video and multimedia archive indexing.
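
The identity "remembering" behaviour described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation. The `Observation` fields, the `label_track` function, and the 15-degree near-frontal threshold are all hypothetical, since the abstract gives no numbers or interfaces. The sketch assumes only that recognition results are trusted at near-frontal poses and that the last recognized identity is carried along the track elsewhere.

```python
from dataclasses import dataclass
from typing import List, Optional

# Hypothetical threshold: the abstract says "near-frontal" without giving a value.
NEAR_FRONTAL_DEG = 15.0


@dataclass
class Observation:
    """One tracked face in one frame (illustrative fields, not from the paper)."""
    yaw_deg: float                # left-right pose angle
    pitch_deg: float              # up-down pose angle
    recognized_as: Optional[str]  # person-specific match result, or None


def is_near_frontal(obs: Observation, limit: float = NEAR_FRONTAL_DEG) -> bool:
    """True when both pose angles are within the assumed near-frontal limit."""
    return abs(obs.yaw_deg) <= limit and abs(obs.pitch_deg) <= limit


def label_track(track: List[Observation]) -> List[Optional[str]]:
    """Assign an identity to every frame of a single face track.

    The identity is updated only at near-frontal frames that produced a match;
    at all other frames the most recently remembered identity is carried forward.
    """
    labels: List[Optional[str]] = []
    remembered: Optional[str] = None
    for obs in track:
        if is_near_frontal(obs) and obs.recognized_as is not None:
            remembered = obs.recognized_as
        labels.append(remembered)
    return labels


if __name__ == "__main__":
    track = [
        Observation(40.0, 0.0, None),   # profile view, nothing remembered yet
        Observation(5.0, -3.0, "A"),    # near-frontal match
        Observation(35.0, 10.0, None),  # non-frontal, identity carried forward
    ]
    print(label_track(track))
```

In this sketch, frames that precede the first near-frontal match stay unlabeled. An indexing application could back-fill them once the track has been identified, but the abstract does not say whether the system does so.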