Synchronization of an animated face with synthetic speech

5th International Conference on Telecommunications in Modern Satellite, Cable and Broadcasting Service. TELSIKS 2001. Proceedings of Papers (Cat. No.01EX517). Pub Date: 2001-09-19. DOI: 10.1109/TELSKS.2001.954885

S.M. Paunovic, M. Milosevic


Citations: 1

Abstract

The creation of audible speech from computer-readable text is called text-to-speech (TTS) synthesis. Next-generation TTS systems will have to convert machine-readable text into audible speech using a different approach. The aim of research in this field is to increase the naturalness of synthetic speech significantly while maintaining good intelligibility. Realistic synthetic mouth motions that match the speech sounds not only give the impression that the image is talking, but can actually increase the intelligibility of the speech through added redundancy. There is also a psychological aspect: animated images (agents, avatars, virtual personae) can increase a user's patience while waiting on searches and database retrievals. The paper presents the latest advances in visual TTS (VTTS).
