Heng Lu, Zhenhua Ling, Si Wei, Yu Hu, Lirong Dai, Ren-Hua Wang
{"title":"Heteronym Verification for Mandarin Speech Synthesis","authors":"Heng Lu, Zhenhua Ling, Si Wei, Yu Hu, Lirong Dai, Ren-Hua Wang","doi":"10.1109/CHINSL.2008.ECP.46","DOIUrl":null,"url":null,"abstract":"Accurate phonetic transcription of speech corpus is critical to high quality speech synthesis. In Mandarin text-to-speech (MTTS) system, one major problem of automatically labeling the database is the heteronym annotation. Because in Mandarin, there are some single-character words or multi-character words have more than one pronunciation. In this paper, a heteronym annotation verification method for MTTS database labeling is proposed. By training contextual dependent HMMs and calculating the log likelihood ratio, each heteronym in the database is assigned a confidence score and those below the threshold are selected for manual inspecting. We divide heteronyms in Mandarin into two categories and different features are used for each category. The result of our experiment on an artificial test set has shown that we can achieve EER (equal error rate) of 7.9% and 11.9% for these two categories. Further test on an actual database which contains a total of 36098 heteronyms has shown that the proposed method can find 89 of all 123 annotation errors by only inspecting 639 polyphones.","PeriodicalId":291958,"journal":{"name":"2008 6th International Symposium on Chinese Spoken Language Processing","volume":"17 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2008-12-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"2","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"2008 6th International Symposium on Chinese Spoken Language Processing","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/CHINSL.2008.ECP.46","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 2
Abstract
Accurate phonetic transcription of speech corpus is critical to high quality speech synthesis. In Mandarin text-to-speech (MTTS) system, one major problem of automatically labeling the database is the heteronym annotation. Because in Mandarin, there are some single-character words or multi-character words have more than one pronunciation. In this paper, a heteronym annotation verification method for MTTS database labeling is proposed. By training contextual dependent HMMs and calculating the log likelihood ratio, each heteronym in the database is assigned a confidence score and those below the threshold are selected for manual inspecting. We divide heteronyms in Mandarin into two categories and different features are used for each category. The result of our experiment on an artificial test set has shown that we can achieve EER (equal error rate) of 7.9% and 11.9% for these two categories. Further test on an actual database which contains a total of 36098 heteronyms has shown that the proposed method can find 89 of all 123 annotation errors by only inspecting 639 polyphones.