Tone error detection of continuous Mandarin speech for L2 learners based on TAM-BLSTM
Authors: Yizhi Wu, Tong Guan
Published in: Neural Networks, Information and Communication Engineering, 2022-06-30
DOI: 10.1117/12.2639121 (https://doi.org/10.1117/12.2639121)
To help second-language (L2) learners of Chinese produce tones correctly in computer-assisted language learning (CALL), tone recognition of continuous speech is necessary. Because tones vary in complex ways in continuous speech, this paper proposes a TAM-BLSTM tone recognition model. First, a generative model, the target approximation model (TAM), is used to simulate the fundamental frequency (f0) from the original f0 contour in units of prosodic words, and the TAM parameters for each Chinese character are derived. Then, a BLSTM model with an attention mechanism is built, taking as input the TAM parameters together with basic acoustic features such as statistical f0 parameters and vowel duration, to perform tone detection in continuous Mandarin speech. Finally, the trained tone detection model is applied to tone error detection for L2 learners. Experimental results on the Biaobei corpus show that the feature set combined with TAM parameters achieves 2.3% higher accuracy than basic acoustic features alone, and that the overall accuracy of the ATT-BLSTM network is higher than that of ATT-LSTM.
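The abstract does not state the TAM equations, so the following is only a minimal sketch of how a target approximation model can generate an f0 contour from per-syllable parameters. It assumes the commonly used quantitative target approximation (qTA) formulation, in which each syllable has a linear pitch target with slope m and height b, approached at rate lam by a third-order critically damped system. The function names, the sampling step, and the example target values are hypothetical and are not taken from the paper.

```python
import math


def qta_syllable(m, b, lam, duration, state, step=0.005):
    """Generate f0 samples for one syllable (assumed qTA formulation).

    m, b     : slope and height of the linear pitch target x(t) = m*t + b
    lam      : rate of target approximation (larger means faster approach)
    duration : syllable duration in seconds
    state    : (f0, velocity, acceleration) at syllable onset
    Returns the sampled contour and the state at syllable offset.
    """
    f0_0, v0, a0 = state
    # Coefficients are fixed by the onset state, so f0 and its first two
    # derivatives stay continuous across the syllable boundary.
    c1 = f0_0 - b
    c2 = v0 + c1 * lam - m
    c3 = (a0 + 2.0 * c2 * lam - c1 * lam ** 2) / 2.0

    def poly(t):
        return c1 + c2 * t + c3 * t * t

    def dpoly(t):
        return c2 + 2.0 * c3 * t

    def f0(t):
        return (m * t + b) + poly(t) * math.exp(-lam * t)

    def vel(t):
        return m + (dpoly(t) - lam * poly(t)) * math.exp(-lam * t)

    def acc(t):
        return (2.0 * c3 - 2.0 * lam * dpoly(t)
                + lam ** 2 * poly(t)) * math.exp(-lam * t)

    n = int(round(duration / step))
    contour = [f0(i * step) for i in range(n)]
    return contour, (f0(duration), vel(duration), acc(duration))


def synthesize(targets, onset_f0):
    """Chain syllables, passing the offset state of each to the next."""
    state = (onset_f0, 0.0, 0.0)
    contour = []
    for m, b, lam, duration in targets:
        samples, state = qta_syllable(m, b, lam, duration, state)
        contour.extend(samples)
    return contour


# Hypothetical targets (slope, height, rate, duration) for a level tone
# followed by a falling tone; f0 is in arbitrary relative units.
example_targets = [(0.0, 5.0, 40.0, 0.2), (-30.0, 4.0, 40.0, 0.2)]
example_contour = synthesize(example_targets, onset_f0=0.0)
print(len(example_contour), "samples")
```

In a pipeline like the one described, the per-syllable values (m, b, lam) would presumably be fitted so that the synthesized contour matches the observed f0 within each prosodic word, and the fitted values would then serve as input features alongside the basic acoustic features. The fitting step is not shown here.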