Improvement of phone recognition accuracy using source and system features

2015 International Conference on Signal Processing and Communication Engineering Systems. Pub Date: 2015-03-12. DOI: 10.1109/SPACES.2015.7058205

K. Manjunath, K. S. Rao, M. G. Reddy


Citations: 5

Abstract

Improvement of phone recognition accuracy using source and system features

The goal of this work is to improve phone recognition accuracy using a combination of source and system features. Because speech is produced by exciting a time-varying vocal tract system with a time-varying excitation, we explore both the source and system components of the speech production system for phone recognition. The excitation source information is derived by processing the linear prediction residual of the speech signal. Mel-frequency cepstral coefficient (MFCC) features are used to capture vocal tract information. The phone recognition systems (PRSs) are developed using hidden Markov models. The proposed PRSs are developed for English and for Bengali, an Indian language, using the TIMIT and Phonetic and Prosodically Rich Transcribed speech corpora, respectively. We have also developed tandem PRSs using the phone posteriors obtained from feedforward neural networks. The tandem PRSs developed using the combination of excitation source and system features outperform the conventional tandem systems developed using system features alone.
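
As an illustration of the linear prediction residual mentioned in the abstract, the following is a minimal sketch, not the authors' implementation. It assumes the autocorrelation method for estimating the predictor and a prediction order of 10; the abstract specifies neither, nor how the residual is processed further into features. The function names `lpc_coefficients` and `lp_residual` are hypothetical.

```python
import numpy as np


def lpc_coefficients(frame, order):
    """Estimate linear prediction coefficients a_1..a_p for one frame.

    Uses the autocorrelation method: solve the normal equations R a = r,
    where R is the Toeplitz matrix of autocorrelation lags 0..p-1 and
    r holds lags 1..p.
    """
    frame = np.asarray(frame, dtype=float)
    n = len(frame)
    # Autocorrelation values for lags 0..order.
    r = np.array([np.dot(frame[: n - k], frame[k:]) for k in range(order + 1)])
    # Symmetric Toeplitz matrix built from lags 0..order-1.
    R = np.array([[r[abs(i - j)] for j in range(order)] for i in range(order)])
    # Small diagonal loading keeps the system solvable for silent frames
    # (an assumption of this sketch, not taken from the paper).
    R = R + (1e-9 * r[0] + 1e-12) * np.eye(order)
    return np.linalg.solve(R, r[1:])


def lp_residual(frame, order=10):
    """Return e[n] = s[n] - sum_{k=1..p} a_k * s[n-k] for one frame."""
    frame = np.asarray(frame, dtype=float)
    a = lpc_coefficients(frame, order)
    predicted = np.zeros_like(frame)
    for k in range(1, order + 1):
        # Sample n is predicted from the sample k steps earlier.
        predicted[k:] += a[k - 1] * frame[:-k]
    return frame - predicted
```

The predictor models the vocal tract (system) contribution, so the residual that remains after subtracting the prediction mainly reflects the excitation source. In practice the signal would typically be split into short windowed frames before calling `lp_residual` on each one.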
