A high performance Mandarin digit recognizer
Authors: Zan Bo, Liu Juan, Gang Peng, William S-Y. Wang
Published in: ISSPA '99. Proceedings of the Fifth International Symposium on Signal Processing and its Applications (IEEE Cat. No.99EX359)
Publication date: 1999-08-22
DOI: 10.1109/ISSPA.1999.815751 (https://doi.org/10.1109/ISSPA.1999.815751)
Citations: 4
Abstract
Digit recognition is important in applications such as automated banking systems and database information retrieval systems. To design a high-performance Mandarin digit recognizer, a Mandarin phonetic question set was first carefully designed and then used to cluster 846 gender-dependent cross-word triphones. To model the fine differences in the high-frequency region of Mandarin initials, inverse mel-frequency warping was used to calculate the IMFCC feature. The IMFCC feature was shown to be quite effective in correcting the substitution errors caused by the similarity of the Mandarin initials. Combined with triphone duration modeling, the recognizer achieved a word accuracy of 98.81% and a sentence correct rate of 95.20%.
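The abstract does not spell out how the inverse mel-frequency warping is defined, so the following is only a minimal sketch of one common way to build such a scale: take filter centre frequencies that are equally spaced on the standard mel scale and mirror them about the middle of the band, so that the filters become densely packed at high frequencies instead of low ones. The function names, the 24-filter count, and the 8000 Hz upper band edge are illustrative assumptions, not values taken from the paper.

```python
import math

def hz_to_mel(f_hz):
    # Widely used mel warping formula (an assumption; the paper may use another form).
    return 2595.0 * math.log10(1.0 + f_hz / 700.0)

def mel_to_hz(m):
    # Inverse of hz_to_mel.
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)

def mel_centers(n_filters, f_max_hz):
    # Centre frequencies equally spaced on the mel scale, strictly inside (0, f_max_hz).
    step = hz_to_mel(f_max_hz) / (n_filters + 1)
    return [mel_to_hz(step * (i + 1)) for i in range(n_filters)]

def inverse_mel_centers(n_filters, f_max_hz):
    # Hypothetical helper: mirror the mel centres about f_max_hz / 2,
    # which gives fine resolution at high frequencies and coarse resolution at low ones.
    return sorted(f_max_hz - c for c in mel_centers(n_filters, f_max_hz))

# Assumed setup: 24 filters over a 0-8000 Hz band (16 kHz sampling).
centers = inverse_mel_centers(24, 8000.0)
gaps = [b - a for a, b in zip(centers, centers[1:])]
```

With this mirrored layout the spacing between neighbouring centres shrinks as frequency increases, which is the property that would let a cepstral feature resolve fine high-frequency differences such as those between similar Mandarin initials. Cepstral coefficients would then be computed from the filterbank energies in the same way as for ordinary MFCCs.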