Yao-Jen Chang, C. Hsieh, Pei-Wei Hsu, Yung-Chang Chen
{"title":"虚拟会议系统的语音辅助面部表情分析与合成","authors":"Yao-Jen Chang, C. Hsieh, Pei-Wei Hsu, Yung-Chang Chen","doi":"10.1109/ICME.2003.1221365","DOIUrl":null,"url":null,"abstract":"Fast, reliable, and marker-free facial expression analysis still remains to be a difficult task in computer vision research. In this paper, the concept of speech-assisted facial expression analysis and synthesis is proposed, which shows that the speech-driven facial animation technique not only can be used for expression analysis. From the input speech, the mouth shape can is estimated from the audio-visual model. Thus, the large search space of mouth appearance is reduced for mouth tracking. Similarly, the modeling technique is extended from modeling speech and mouth shape to facial movements and detail facial texture changes. In this way, a virtual conferencing system with video realistic avatars is realized to meet real-time requirement.","PeriodicalId":118560,"journal":{"name":"2003 International Conference on Multimedia and Expo. ICME '03. Proceedings (Cat. No.03TH8698)","volume":"27 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2003-07-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"5","resultStr":"{\"title\":\"Speech-assisted facial expression analysis and synthesis for virtual conferencing systems\",\"authors\":\"Yao-Jen Chang, C. Hsieh, Pei-Wei Hsu, Yung-Chang Chen\",\"doi\":\"10.1109/ICME.2003.1221365\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Fast, reliable, and marker-free facial expression analysis still remains to be a difficult task in computer vision research. In this paper, the concept of speech-assisted facial expression analysis and synthesis is proposed, which shows that the speech-driven facial animation technique not only can be used for expression analysis. From the input speech, the mouth shape can is estimated from the audio-visual model. 
Thus, the large search space of mouth appearance is reduced for mouth tracking. Similarly, the modeling technique is extended from modeling speech and mouth shape to facial movements and detail facial texture changes. In this way, a virtual conferencing system with video realistic avatars is realized to meet real-time requirement.\",\"PeriodicalId\":118560,\"journal\":{\"name\":\"2003 International Conference on Multimedia and Expo. ICME '03. Proceedings (Cat. No.03TH8698)\",\"volume\":\"27 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2003-07-06\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"5\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"2003 International Conference on Multimedia and Expo. ICME '03. Proceedings (Cat. No.03TH8698)\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1109/ICME.2003.1221365\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"2003 International Conference on Multimedia and Expo. ICME '03. Proceedings (Cat. No.03TH8698)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/ICME.2003.1221365","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Speech-assisted facial expression analysis and synthesis for virtual conferencing systems
Fast, reliable, and marker-free facial expression analysis remains a difficult task in computer vision research. In this paper, the concept of speech-assisted facial expression analysis and synthesis is proposed, showing that the speech-driven facial animation technique can be used not only for expression synthesis but also for expression analysis. From the input speech, the mouth shape can be estimated with the audio-visual model. Thus, the large search space of mouth appearances is reduced for mouth tracking. Similarly, the modeling technique is extended from modeling speech and mouth shape to facial movements and detailed facial texture changes. In this way, a virtual conferencing system with video-realistic avatars is realized that meets real-time requirements.
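The core idea of speech-assisted tracking can be sketched as follows: an audio-visual model predicts a mouth shape from the speech signal, and that prediction prunes the set of candidate mouth appearances the tracker must evaluate. The sketch below is illustrative only, assuming hypothetical viseme labels, toy audio features, and a two-parameter (openness, width) mouth-shape representation; none of these names or values come from the paper.

```python
import math

# Hypothetical codebook of mouth-shape parameters (openness, width),
# one entry per viseme class; values are illustrative, not from the paper.
MOUTH_CODEBOOK = {
    "silence": (0.0, 0.5),
    "ah":      (0.9, 0.6),
    "ee":      (0.3, 0.9),
    "oo":      (0.6, 0.2),
}

def predict_mouth_shape(audio_feature):
    """Stand-in for the audio-visual model: map a scalar audio feature
    to the mouth-shape parameters of the nearest reference viseme."""
    refs = {"silence": 0.0, "ah": 1.0, "ee": 2.0, "oo": 3.0}  # toy references
    viseme = min(refs, key=lambda v: abs(refs[v] - audio_feature))
    return MOUTH_CODEBOOK[viseme]

def track_mouth(candidates, predicted, radius=0.3):
    """Speech-assisted tracking: discard candidate mouth shapes far from
    the speech-predicted shape, then match only against the remainder."""
    def dist(a, b):
        return math.hypot(a[0] - b[0], a[1] - b[1])
    pruned = [c for c in candidates if dist(c, predicted) <= radius]
    pool = pruned or candidates  # fall back if pruning removed everything
    return min(pool, key=lambda c: dist(c, predicted)), len(pruned)

predicted = predict_mouth_shape(1.1)  # audio feature closest to "ah"
candidates = [(0.1, 0.5), (0.85, 0.55), (0.6, 0.2), (0.95, 0.7)]
best, kept = track_mouth(candidates, predicted)
print(best, kept)  # only 2 of 4 candidates survive pruning
```

In the paper, the prediction comes from a trained audio-visual model over real speech features rather than a nearest-neighbour lookup, but the effect is the same: the tracker searches a small, speech-constrained region of mouth-appearance space, which is what makes real-time operation feasible.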