Speaker-trained recognition using allophonic enrollment models

IEEE Workshop on Automatic Speech Recognition and Understanding, 2001. ASRU '01. Pub Date: 2001-12-09. DOI: 10.1109/ASRU.2001.1034589

V. Vanhoucke, M. Hochberg, C. Leggetter


Citations: 2

Abstract

We introduce a method for performing speaker-trained recognition based on context-dependent allophone models from a large-vocabulary, speaker-independent recognition system. A set of speaker-enrollment templates is selected from the context-dependent allophone models. These templates are used to build representations of the speaker-enrolled utterances. The advantages of this approach include improved performance and portability of the enrollments across different acoustic models. We describe the approach used to select the enrollment templates and how to apply them to speaker-trained recognition. The approach has been evaluated on an over-the-telephone, voice-activated dialing task and shows significant performance improvements over techniques based on context-independent phone models or general acoustic model templates. In addition, the portability of enrollments from one model set to another is shown to result in almost no performance degradation.

