A high performance Mandarin digit recognizer
Authors: Zan Bo, Liu Juan, Gang Peng, William S-Y. Wang
Published in: ISSPA '99. Proceedings of the Fifth International Symposium on Signal Processing and its Applications (IEEE Cat. No.99EX359), 1999-08-22
DOI: https://doi.org/10.1109/ISSPA.1999.815751

Digit recognition is important in applications such as automated banking systems and database information retrieval systems. To design a high-performance Mandarin digit recognizer, a Mandarin phonetic question set was first carefully designed and then used to cluster 846 gender-dependent cross-word triphones. To model the fine differences in the high-frequency region of Mandarin initials, inverse mel-frequency warping was used to calculate the IMFCC feature. The IMFCC feature was shown to be quite effective in correcting the substitution errors caused by the similarity of Mandarin initials. Combined with triphone duration modeling, the recognizer achieved a word accuracy of 98.81% and a sentence correct rate of 95.20%.
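
The abstract does not give the warping formula, so the following is only a minimal sketch of one plausible reading of "inverse mel-frequency warping": the ordinary mel filter edges are mirrored about the middle of the band, so that the filters become narrowest near the top of the spectrum, where Mandarin initials differ most. The mel formula (2595 log10(1 + f/700)), the mirroring construction, the filter count, and all function names are assumptions made for illustration, not details taken from the paper.

```python
import math


def hz_to_mel(f_hz):
    # Common mel-scale convention (an assumption; the paper may use another).
    return 2595.0 * math.log10(1.0 + f_hz / 700.0)


def mel_to_hz(mel):
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)


def mel_edges(n_filters, f_max):
    """Filter edge frequencies (Hz), equally spaced on the mel scale from 0 to f_max."""
    top = hz_to_mel(f_max)
    return [mel_to_hz(top * i / (n_filters + 1)) for i in range(n_filters + 2)]


def inverse_mel_edges(n_filters, f_max):
    """Hypothetical inverse warping: mirror the mel edges so resolution is finest near f_max."""
    return [f_max - f for f in reversed(mel_edges(n_filters, f_max))]


def triangular_filterbank(edges, n_fft, sample_rate):
    """Triangular filters over FFT bins; edges must be ascending, length n_filters + 2."""
    bin_hz = [k * sample_rate / n_fft for k in range(n_fft // 2 + 1)]
    bank = []
    for lo, centre, hi in zip(edges, edges[1:], edges[2:]):
        # Rising slope up to the centre, falling slope after it, clipped at zero.
        bank.append([
            max(0.0, min((f - lo) / (centre - lo), (hi - f) / (hi - centre)))
            for f in bin_hz
        ])
    return bank


if __name__ == "__main__":
    inv = inverse_mel_edges(20, 4000.0)
    bank = triangular_filterbank(inv, n_fft=256, sample_rate=8000)
    print(len(bank), len(bank[0]))
    print([round(f, 1) for f in inv[-4:]])
```

Under these assumptions, an IMFCC-style feature would follow the usual cepstral recipe (log filterbank energies followed by a discrete cosine transform), with this mirrored filterbank used in place of the standard mel filterbank.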