Telephone speech data corpus and performances of speaker independent recognition system using the corpus

Proceedings of 2nd IEEE Workshop on Interactive Voice Technology for Telecommunications Applications. Pub Date: 1994-09-26. DOI: 10.1109/IVTTA.1994.341535

T. Isobe, K. Murakami


Citations: 3

Abstract

The authors first describe a speech data corpus collected over the telephone from 400 male and 400 female subjects. They then compare the performance of two triphone-model-based speaker-independent recognition systems, both trained and tested on the corpus. One system uses a conventional continuous mixture density HMM (CDHMM); the other uses a CDHMM with a tree structure of 2,064 Gaussian distributions, which needs only one thirtieth of the Gaussian computation of the conventional model. The tree-structure system performed only 3% below the system using the conventional CDHMM. This shows that tree-structure CDHMMs are useful for telephone speech recognition.
