{"title":"电话语音数据语料库及使用该语料库的说话人独立识别系统的性能","authors":"T. Isobe, K. Murakami","doi":"10.1109/IVTTA.1994.341535","DOIUrl":null,"url":null,"abstract":"The authors first describe the speech data corpus they collected from 400 male and 400 female subjects over the phone. They then compare the performances of two types of triphone model based speaker independent recognition systems, in which they used the corpus for training models and testing. One system uses a normal continuous mixture density HMM, and the other uses a CDHMM with a tree structure of 2,064 Gaussian distributions, which needs only one thirtieth of the Gaussian computation of a normal one. As a result, the system with the tree-structure CDHMM performed as well as 3% less than the system using the normal CDHMM. This shows that tree-structure CDHMM are useful for telephone speech recognition.<<ETX>>","PeriodicalId":435907,"journal":{"name":"Proceedings of 2nd IEEE Workshop on Interactive Voice Technology for Telecommunications Applications","volume":"12 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"1994-09-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"3","resultStr":"{\"title\":\"Telephone speech data corpus and performances of speaker independent recognition system using the corpus\",\"authors\":\"T. Isobe, K. Murakami\",\"doi\":\"10.1109/IVTTA.1994.341535\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"The authors first describe the speech data corpus they collected from 400 male and 400 female subjects over the phone. They then compare the performances of two types of triphone model based speaker independent recognition systems, in which they used the corpus for training models and testing. One system uses a normal continuous mixture density HMM, and the other uses a CDHMM with a tree structure of 2,064 Gaussian distributions, which needs only one thirtieth of the Gaussian computation of a normal one. As a result, the system with the tree-structure CDHMM performed as well as 3% less than the system using the normal CDHMM. This shows that tree-structure CDHMM are useful for telephone speech recognition.<<ETX>>\",\"PeriodicalId\":435907,\"journal\":{\"name\":\"Proceedings of 2nd IEEE Workshop on Interactive Voice Technology for Telecommunications Applications\",\"volume\":\"12 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"1994-09-26\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"3\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Proceedings of 2nd IEEE Workshop on Interactive Voice Technology for Telecommunications Applications\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1109/IVTTA.1994.341535\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Proceedings of 2nd IEEE Workshop on Interactive Voice Technology for Telecommunications Applications","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/IVTTA.1994.341535","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Telephone speech data corpus and performances of speaker independent recognition system using the corpus
The authors first describe the speech data corpus they collected from 400 male and 400 female subjects over the phone. They then compare the performances of two types of triphone model based speaker independent recognition systems, in which they used the corpus for training models and testing. One system uses a normal continuous mixture density HMM, and the other uses a CDHMM with a tree structure of 2,064 Gaussian distributions, which needs only one thirtieth of the Gaussian computation of a normal one. As a result, the system with the tree-structure CDHMM performed as well as 3% less than the system using the normal CDHMM. This shows that tree-structure CDHMM are useful for telephone speech recognition.<>