Shape and appearance models of talking faces for model-based tracking

2003 IEEE International SOI Conference. Proceedings (Cat. No.03CH37443). Pub Date: 2003-10-17. DOI: 10.1109/AMFG.2003.1240836

M. Odisio, G. Bailly


Citations: 16

Abstract

We present a system that can recover and track the 3D speech movements of a speaker's face for each image of a monocular sequence. A speaker-specific face model is used for tracking: model parameters are extracted from each image by an analysis-by-synthesis loop. To handle both the individual specificities of the speaker's articulation and the complexity of the facial deformations during speech, speaker-specific models of the face 3D geometry and appearance are built from real data. The geometric model is linearly controlled by only six articulatory parameters. Appearance is seen either as a classical texture map or through local appearance of a relevant subset of 3D points. We compare several appearance models: they are either constant or depend linearly on the articulatory parameters. We evaluate these different appearance models with ground truth data.
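The abstract describes a geometric model controlled linearly by six articulatory parameters, fitted to each image by an analysis-by-synthesis loop. The following is a minimal sketch of that idea, not the authors' implementation. It makes several assumptions that are not in the source: the model is random toy data rather than one built from real recordings, head pose is known, projection is orthographic, and the loop compares projected 2D point positions instead of image appearance. All names (`make_toy_model`, `synthesize_shape`, `project`, `fit_params`) and the point count are hypothetical.

```python
import numpy as np

N_POINTS = 50   # hypothetical number of 3D face points
N_PARAMS = 6    # six articulatory parameters, as stated in the abstract


def make_toy_model(rng):
    """Random stand-in for a speaker-specific model learned from real data."""
    mean_shape = rng.normal(size=3 * N_POINTS)
    shape_basis = rng.normal(size=(3 * N_POINTS, N_PARAMS))
    return mean_shape, shape_basis


def synthesize_shape(mean_shape, shape_basis, params):
    """Linear geometric model: 3D points = mean + basis @ params."""
    return (mean_shape + shape_basis @ params).reshape(N_POINTS, 3)


def project(points_3d):
    """Orthographic projection (drops depth); an assumption of this sketch."""
    return points_3d[:, :2]


def fit_params(observed_2d, mean_shape, shape_basis, n_iter=30, step=0.5):
    """Analysis-by-synthesis: synthesize, compare with the observation, update."""
    params = np.zeros(N_PARAMS)
    # Jacobian of the projected points with respect to the parameters.
    # It is constant here because both the model and the projection are linear.
    jac = shape_basis.reshape(N_POINTS, 3, N_PARAMS)[:, :2, :].reshape(-1, N_PARAMS)
    for _ in range(n_iter):
        synthesized = project(synthesize_shape(mean_shape, shape_basis, params))
        residual = (observed_2d - synthesized).ravel()
        # Least-squares update that best explains the remaining residual.
        update, *_ = np.linalg.lstsq(jac, residual, rcond=None)
        params = params + step * update  # damped step
    return params


rng = np.random.default_rng(0)
mean_shape, shape_basis = make_toy_model(rng)
true_params = rng.normal(size=N_PARAMS)
observed = project(synthesize_shape(mean_shape, shape_basis, true_params))
estimated = fit_params(observed, mean_shape, shape_basis)
print("max parameter error:", np.abs(estimated - true_params).max())
```

Because everything in this toy setting is linear, a single undamped least-squares step would already recover the parameters; the damped loop is kept only to mirror the iterative synthesize-compare-update structure. With a perspective camera, unknown pose, or an image-based appearance term, the residual becomes nonlinear in the parameters and the iterations become essential.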
