Creating a Speech Enabled Avatar from a Single Photograph

D. Bitouk, S. Nayar
Published in: 2008 IEEE Virtual Reality Conference, 2008-03-08
DOI: 10.1109/VR.2008.4480758
Citations: 10

Abstract

This paper presents a complete framework for creating a speech-enabled avatar from a single image of a person. Our approach uses a generic facial motion model which represents deformations of a prototype face during speech. We have developed an HMM-based facial animation algorithm which takes into account both lexical stress and coarticulation. This algorithm produces realistic animations of the prototype facial surface from either text or speech. The generic facial motion model can be transformed to a novel face geometry using a set of corresponding points between the prototype face surface and the novel face. Given a face photograph, a small number of manually selected features in the photograph are used to deform the prototype face surface. The deformed surface is then used to animate the face in the photograph. We show several examples of avatars that are driven by text and speech inputs.
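The animation step maps text or speech to facial motion while accounting for coarticulation (neighbouring phonemes influencing each other's mouth shapes). The paper models this with trained HMMs, which are not reproduced here; the sketch below only illustrates the coarticulation idea by building a per-frame control curve from timed phonemes and smoothing it. The viseme table and the scalar "mouth openness" parameter are hypothetical stand-ins.

```python
import numpy as np

# Hypothetical phoneme-to-viseme targets (scalar "mouth openness");
# the paper learns facial deformations from data instead.
VISEME_TARGET = {"AA": 1.0, "IY": 0.4, "M": 0.0, "S": 0.2}

def viseme_track(phonemes, durations, fps=30, window=5):
    """Build a frame-rate control curve from timed phonemes, then smooth
    it with a moving average as a crude stand-in for coarticulation.
    `durations` are per-phoneme lengths in seconds."""
    frames = []
    for ph, dur in zip(phonemes, durations):
        frames += [VISEME_TARGET[ph]] * max(1, round(dur * fps))
    x = np.array(frames, dtype=float)
    kernel = np.ones(window) / window          # moving-average smoother
    return np.convolve(x, kernel, mode="same")
```

In a real system the smoothed curve would index into a learned deformation basis per frame; here it simply shows how abrupt per-phoneme targets become gradual transitions.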
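The transfer to a novel face described above rests on a set of corresponding points between the prototype surface and landmarks picked in the photograph. The abstract does not specify the interpolant used to propagate the landmark displacements to the whole surface; the following is a minimal sketch using radial-basis-function scattered-data interpolation (an r³ kernel, a common thin-plate-style choice in 3D), with all array names hypothetical.

```python
import numpy as np

def rbf_warp(prototype_pts, target_pts, surface_vertices):
    """Deform a prototype face surface so that a small set of landmark
    points on it move onto the corresponding landmarks of a novel face.

    prototype_pts:    (n, 3) landmarks on the prototype surface
    target_pts:       (n, 3) corresponding landmarks on the novel face
    surface_vertices: (m, 3) all vertices of the prototype surface
    """
    # Kernel matrix between landmarks: phi(r) = r^3.
    diff = prototype_pts[:, None, :] - prototype_pts[None, :, :]
    K = np.linalg.norm(diff, axis=-1) ** 3
    # Solve for per-landmark weights (tiny ridge term for stability).
    weights = np.linalg.solve(K + 1e-9 * np.eye(len(K)),
                              target_pts - prototype_pts)
    # Evaluate the interpolated displacement at every surface vertex.
    d = surface_vertices[:, None, :] - prototype_pts[None, :, :]
    Kv = np.linalg.norm(d, axis=-1) ** 3
    return surface_vertices + Kv @ weights
```

By construction the warp reproduces the landmark correspondences exactly (up to the ridge term) and falls off smoothly elsewhere, which is the behaviour a small set of manually selected photograph features needs to drive a whole surface.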