Automatic Speaker-level Pronunciation Assessment of L2 Speech Using Posterior Probabilities from Multiple Utterances
Guolei Jiang, Chunhong Liao, Kun Li, Pengfei Liu, Linying Jiang, H. Meng
2021 12th International Symposium on Chinese Spoken Language Processing (ISCSLP), published 2021-01-24
DOI: 10.1109/ISCSLP49672.2021.9362121 (https://doi.org/10.1109/ISCSLP49672.2021.9362121)
Abstract
Evaluating the level of accentedness is important for second language education, both in qualifying language teachers and in offering advice and feedback to learners. Previous methods evaluated a speaker's accentedness from a limited number of utterances, which leads to biased or unstable results because sparse data cannot fully cover speaker-specific pronunciation errors. To enhance the stability of evaluation, we investigate the use of speaker-level features and speaker-level neural networks trained on multiple utterances. Experimental results demonstrate that speaker-level features and speaker-level models provide high accent classification accuracy, comparable with human annotations. The proposed approach also enhances the stability of the evaluation results.
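
The abstract does not say how the speaker-level features are built, so the following is only an illustrative sketch of the general idea of pooling evidence over multiple utterances. It assumes a hypothetical input format in which each utterance is a list of (phone, log-posterior) pairs, and it uses per-phone mean pooling as the aggregation. The function name, the input format, and the choice of mean pooling are all assumptions and are not taken from the paper.

```python
from collections import defaultdict
from statistics import mean


def speaker_level_features(utterances, phone_set):
    """Pool per-phone log-posterior scores over all utterances of one speaker.

    utterances: list of utterances, each a list of (phone, log_posterior)
                pairs (hypothetical input format, not from the paper).
    phone_set:  ordered list of phones defining the feature vector layout.
    Returns one fixed-length vector holding the mean log-posterior per phone,
    with 0.0 for phones the speaker never produced.
    """
    scores = defaultdict(list)
    for utterance in utterances:
        for phone, log_post in utterance:
            scores[phone].append(log_post)
    return [mean(scores[p]) if p in scores else 0.0 for p in phone_set]


# Two short utterances from the same (made-up) speaker.
example_utterances = [
    [("AE", -0.2), ("TH", -1.5)],
    [("TH", -2.5), ("R", -0.4)],
]
example_phones = ["AE", "R", "TH", "ZH"]
example_features = speaker_level_features(example_utterances, example_phones)
```

With more utterances, each phone's mean is estimated from more observations, which is one plausible reason a speaker-level representation could be more stable than a score computed from a single utterance. A classifier operating on such a vector would see one input per speaker rather than one per utterance.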