{"title":"Automated gesturing for virtual characters: speech-driven and text-driven approaches","authors":"G. Zoric, K. Smid, I. Pandzic","doi":"10.4304/jmm.1.1.62-68","DOIUrl":null,"url":null,"abstract":"We present two methods for automatic facial gesturing of graphically embodied animated agents. In one case, conversational agent is driven by speech in automatic lip sync process. By analyzing speech input, lip movements are determined from the speech signal. Another method provides virtual speaker capable of reading plain English text and rendering it in a form of speech accompanied by the appropriate facial gestures. Proposed statistical model for generating virtual speaker's facial gestures can be also applied as addition to lip synchronization process in order to obtain speech driven facial gesturing. In this case statistical model is triggered with the input speech prosody instead of lexical analysis of the input text.","PeriodicalId":238993,"journal":{"name":"ISPA 2005. Proceedings of the 4th International Symposium on Image and Signal Processing and Analysis, 2005.","volume":"2 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2005-10-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"8","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"ISPA 2005. Proceedings of the 4th International Symposium on Image and Signal Processing and Analysis, 2005.","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.4304/jmm.1.1.62-68","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Citations: 8
Abstract
We present two methods for automatic facial gesturing of graphically embodied animated agents. In the first, a conversational agent is driven by speech through an automatic lip-sync process: the speech input is analyzed and lip movements are determined directly from the speech signal. The second method provides a virtual speaker capable of reading plain English text and rendering it as speech accompanied by appropriate facial gestures. The proposed statistical model for generating the virtual speaker's facial gestures can also be added to the lip-synchronization process to obtain speech-driven facial gesturing. In that case, the statistical model is triggered by the prosody of the input speech instead of by lexical analysis of the input text.
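To make the speech-driven variant concrete, below is a minimal sketch of how a statistical gesture trigger driven by prosody might look. The paper does not publish its learned gesture statistics or its prosodic feature extraction, so the gesture inventory, probabilities, threshold values, and function names here are illustrative assumptions, not the authors' implementation.

```python
import random

# Hypothetical gesture inventory with assumed trigger probabilities; the
# actual statistics in the paper are learned from data and are not reproduced here.
GESTURE_PROBS = {
    "eyebrow_raise": 0.4,
    "head_nod": 0.3,
    "blink": 0.2,
    "none": 0.1,
}

def detect_prosodic_events(pitch_contour, energy_contour,
                           pitch_jump=20.0, energy_jump=6.0):
    """Flag frames where pitch (Hz) or energy (dB) rises sharply.

    A crude stand-in for the prosodic analysis that would trigger the
    statistical gesture model in the speech-driven approach.
    """
    events = []
    for i in range(1, len(pitch_contour)):
        if (pitch_contour[i] - pitch_contour[i - 1] > pitch_jump or
                energy_contour[i] - energy_contour[i - 1] > energy_jump):
            events.append(i)
    return events

def sample_gesture(rng=random):
    """Draw one facial gesture from the assumed trigger distribution."""
    r = rng.random()
    cumulative = 0.0
    for gesture, p in GESTURE_PROBS.items():
        cumulative += p
        if r <= cumulative:
            return gesture
    return "none"

if __name__ == "__main__":
    # Toy pitch/energy contours with one clear prosodic event around frame 3.
    pitch = [110.0, 112.0, 111.0, 145.0, 144.0, 118.0]
    energy = [60.0, 61.0, 60.5, 65.0, 64.0, 61.0]
    for frame in detect_prosodic_events(pitch, energy):
        print(f"frame {frame}: {sample_gesture()}")
```

In the text-driven variant, the same sampling step would instead be triggered by lexical cues (e.g., sentence boundaries or emphasized words) obtained from analysis of the input text rather than from prosodic events.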