2008 6th International Symposium on Chinese Spoken Language Processing: Latest Papers (Page 2)

Decision Fusion for Improving Mispronunciation Detection Using Language Transfer Knowledge and Phoneme-Dependent Pronunciation Scoring

2008 6th International Symposium on Chinese Spoken Language Processing Pub Date : 2008-12-30 DOI: 10.1109/CHINSL.2008.ECP.18

W. Lo, Alissa M. Harrison, H. Meng, Lan Wang

Abstract: Applying linguistic knowledge of language transfer to automatic speech recognition (ASR) technology can enhance mispronunciation detection performance in computer-aided pronunciation training (CAPT). This is achieved by pinpointing salient pronunciation errors made by second language learners. In this work, we propose to apply decision fusion for further improvement in mispronunciation detection performance. The detection decision from the linguistically-motivated detector, which applies language transfer knowledge, is used as the basis. The system backs off to posterior-probability-based pronunciation scoring with phoneme-dependent thresholds when the basis is "less reliable". Fusion can help combat problems such as incomplete coverage of linguistic knowledge as well as the imperfection of acoustic models in ASR. Our fusion strategy can maintain the diagnosis capability of the linguistically-motivated approach while achieving a major boost in detection performance. Experimental results show that decision fusion can achieve a relative improvement in mispronunciation detection of up to a 30% reduction in the total number of decision errors.
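The back-off strategy described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the `PhoneObservation` fields, the `basis_reliable` flag, the threshold values, and the "score below threshold means mispronounced" convention are all assumptions made for illustration, since the abstract does not say how reliability is judged or how scores are scaled.

```python
from dataclasses import dataclass


@dataclass
class PhoneObservation:
    """One phone in a learner's utterance (all fields are hypothetical)."""
    phoneme: str            # canonical phoneme label
    basis_says_error: bool  # decision of the linguistically-motivated detector
    basis_reliable: bool    # assumed flag: is the basis decision trustworthy?
    posterior_score: float  # posterior-probability-based pronunciation score


def fuse_decision(obs, thresholds, default_threshold=0.5):
    """Return True if the phone is flagged as mispronounced.

    Uses the basis decision when it is marked reliable; otherwise backs off
    to comparing the posterior score with a phoneme-dependent threshold.
    """
    if obs.basis_reliable:
        return obs.basis_says_error
    # Back-off path: each phoneme has its own threshold (assumed values).
    threshold = thresholds.get(obs.phoneme, default_threshold)
    return obs.posterior_score < threshold


# Illustrative, made-up thresholds for two phonemes.
thresholds = {"th": 0.6, "ae": 0.3}
reliable = PhoneObservation("th", True, True, 0.9)
fallback = PhoneObservation("ae", True, False, 0.5)
print(fuse_decision(reliable, thresholds))  # basis decision is used
print(fuse_decision(fallback, thresholds))  # threshold comparison is used
```

Keeping the basis decision whenever it is available mirrors the abstract's point that fusion should preserve the diagnostic capability of the linguistically-motivated approach.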

Citations: 9

Improvements on Mel-Frequency Cepstrum Minimum-Mean-Square-Error Noise Suppressor for Robust Speech Recognition

2008 6th International Symposium on Chinese Spoken Language Processing Pub Date : 2008-12-30 DOI: 10.1109/CHINSL.2008.ECP.29

Dong Yu, L. Deng, Jian Wu, Y. Gong, A. Acero

Abstract: Recently we have developed a non-linear feature-domain noise reduction algorithm based on the minimum mean square error (MMSE) criterion on Mel-frequency cepstra (MFCC) for environment-robust speech recognition. Our novel algorithm operates on the power spectral magnitude of the filter-bank's outputs and outperforms the log-MMSE spectral amplitude noise suppressor proposed by Ephraim and Malah in both recognition accuracy and efficiency, as demonstrated on the Aurora-3 corpora. This paper serves two purposes. First, we show that the algorithm is effective on large-vocabulary tasks with tri-phone acoustic models. Second, we report two improvements to the suppression rule of the original MFCC-MMSE noise suppressor: smoothing the gain over the previous frames to prevent abrupt changes of the gain across frames, and adjusting the gain function based on the noise power so that the suppression is aggressive when the noise level is high and conservative when the noise level is low. We also propose an efficient and effective parameter tuning algorithm, named the step-adaptive discriminative learning algorithm (SADLA), to adjust the parameters used by the noise tracker and the suppressor. We observed a 46% relative word error rate (WER) reduction on an in-house large-vocabulary noisy speech database with a clean-trained model, which translates into a 16% relative WER reduction over the original MFCC-MMSE noise suppressor. On the Aurora-3 corpora, we observed a 6% relative WER reduction over our original MFCC-MMSE algorithm, or a 30% relative WER reduction over the CMN baseline.
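The idea of smoothing the suppression gain over previous frames can be sketched as follows. This is only an illustration under stated assumptions: the abstract does not give the smoothing rule, so first-order recursive (exponential) smoothing is swapped in here, and the function name `smooth_gains` and the weight `alpha` are hypothetical. The noise-power-dependent adjustment of the gain function is not modeled.

```python
def smooth_gains(raw_gains, alpha=0.7):
    """Smooth per-frame gains for one filter-bank channel.

    Each output is a weighted mix of the previous smoothed gain and the
    current raw gain, which damps abrupt frame-to-frame changes.
    `alpha` (assumed, in [0, 1)) controls how much history is kept.
    """
    smoothed = []
    prev = None
    for gain in raw_gains:
        # The first frame has no history, so it passes through unchanged.
        prev = gain if prev is None else alpha * prev + (1.0 - alpha) * gain
        smoothed.append(prev)
    return smoothed


# A sudden drop in the raw gain is spread over several frames.
print(smooth_gains([1.0, 0.0, 0.0], alpha=0.5))  # → [1.0, 0.5, 0.25]
```

A larger `alpha` gives a steadier gain at the cost of reacting more slowly to genuine changes in the noise.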

Citations: 8

The Pitch Analysis of Imperative Sentences in Standard Chinese

2008 6th International Symposium on Chinese Spoken Language Processing Pub Date : 2008-12-30 DOI: 10.1109/CHINSL.2008.ECP.78

Jia Sun, J. Lu, Ai-jun Li, Yuan Jia

Citations: 1

Using Reference to Tune Language Model for Detection of Reading Miscues

2008 6th International Symposium on Chinese Spoken Language Processing Pub Date : 2008-12-30 DOI: 10.1109/CHINSL.2008.ECP.87

Changliang Liu, Fuping Pan, Fengpei Ge, Bin Dong, Yonghong Yan

Citations: 0

Investigation on Adaptation Using Different Discriminative Training Criteria Based Linear Regression and MAP

2008 6th International Symposium on Chinese Spoken Language Processing Pub Date : 2008-12-30 DOI: 10.1109/CHINSL.2008.ECP.35

Bo Zhu, Zhijie Yan, Yu Hu, Zhiguo Wang, Lirong Dai, Ren-Hua Wang

Citations: 2

A Two-Stage Algorithm for Multi-Speaker Identification System

2008 6th International Symposium on Chinese Spoken Language Processing Pub Date : 2008-12-30 DOI: 10.1109/CHINSL.2008.ECP.52

Yong Guan, Wenju Liu

Citations: 3

An Improvement for Training Efficiency of Semi-Tied Covariance

2008 6th International Symposium on Chinese Spoken Language Processing Pub Date : 2008-12-30 DOI: 10.1109/CHINSL.2008.ECP.62

Sibao Chen, Yu Hu, B. Luo, Ren-Hua Wang

Citations: 0

A Perceptual Study of Approximated Cantonese Tone Contours

2008 6th International Symposium on Chinese Spoken Language Processing Pub Date : 2008-12-30 DOI: 10.1109/CHINSL.2008.ECP.24

Yujia Li, Tan Lee

Citations: 2

Improving Automatic Evaluation of Mandarin Pronunciation with Speaker Adaptive Training (SAT) and MLLR Speaker Adaption

2008 6th International Symposium on Chinese Spoken Language Processing Pub Date : 2008-12-30 DOI: 10.1109/CHINSL.2008.ECP.21

Chao Huang, Feng Zhang, F. Soong

Citations: 8

A Synchronous Method for Automatic Scoring of Language Learning

2008 6th International Symposium on Chinese Spoken Language Processing Pub Date : 2008-12-30 DOI: 10.1109/CHINSL.2008.ECP.86

Bin Dong, Yonghong Yan

Citations: 2