Discriminative Product-of-Expert acoustic mapping for cross-lingual phone recognition

K. Sim
Published in: 2009 IEEE Workshop on Automatic Speech Recognition & Understanding
Publication date: 2009-12-01
DOI: 10.1109/ASRU.2009.5372910 (https://doi.org/10.1109/ASRU.2009.5372910)
Citations: 35

Abstract

This paper presents a Product-of-Expert framework to perform probabilistic acoustic mapping for cross-lingual phone recognition. Under this framework, the posterior probabilities of the target HMM states are modelled as the weighted product of experts, where the experts or their weights are modelled as functions of the posterior probabilities of the source HMM states generated by a foreign phone recogniser. Careful choice of these functions leads to the Product-of-Posterior and Posterior Weighted Product-of-Expert models, which can be conveniently represented as 2-layer and 3-layer feed-forward neural networks respectively. Therefore, the commonly used error back-propagation method can be used to discriminatively train the model parameters. Experimental results are presented on the NTIMIT database using Czech, Hungarian and Russian hybrid NN/HMM recognisers as the foreign phone recognisers to recognise English phones. With only about 15.6 minutes of training data, the best acoustic mapping model achieved a 46.00% phone error rate, 2.45 percentage points behind the 43.55% phone error rate of the NN/HMM system trained directly on the full 3.31 hours of data.
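The weighted-product idea in the abstract can be illustrated with a small sketch. It assumes one plausible reading of the Product-of-Posterior model: each target-state posterior is proportional to a product of source-state posteriors raised to learned exponents, which is the same as a softmax over a linear function of the log source posteriors (a 2-layer network). This form is an assumption for illustration and is not taken from the paper's equations. The names `product_of_posteriors`, `cross_entropy_step`, `eps` and `lr` are hypothetical.

```python
import math


def product_of_posteriors(source_post, weights, eps=1e-12):
    """Return target posteriors as a normalised weighted product of experts.

    source_post: list of S source-state posteriors for one frame.
    weights:     T x S list of exponents (assumed parameterisation).
    """
    # Work in the log domain: log of a weighted product is a weighted sum.
    log_src = [math.log(max(p, eps)) for p in source_post]
    scores = [sum(w * x for w, x in zip(row, log_src)) for row in weights]
    # Softmax normalisation, shifted by the maximum for numerical stability.
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy_step(source_post, weights, target_index, lr=0.1, eps=1e-12):
    """One gradient-descent step on the frame-level cross-entropy loss.

    For a softmax output, d(loss)/d(weights[t][s]) equals
    (posterior[t] - indicator[t == target_index]) * log(source_post[s]).
    """
    probs = product_of_posteriors(source_post, weights, eps)
    log_src = [math.log(max(p, eps)) for p in source_post]
    updated = []
    for t, row in enumerate(weights):
        err = probs[t] - (1.0 if t == target_index else 0.0)
        updated.append([w - lr * err * x for w, x in zip(row, log_src)])
    return updated


# Toy frame: three source states mapped onto two target states.
src = [0.7, 0.2, 0.1]
w = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]
before = product_of_posteriors(src, w)
after = product_of_posteriors(src, cross_entropy_step(src, w, target_index=1))
```

In this toy setup each target state initially copies one source state, so `before` is simply the first two source posteriors renormalised. A single discriminative update towards target state 1 raises its posterior, which is the effect back-propagation training would have over many frames.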