Recognition of isolated words of esophageal speech using GMM and gradient descent RBF networks

P. Malathi, G. Suresh
Published in: 2014 International Conference on Communication and Network Technologies
Publication date: 2014-12-01
DOI: 10.1109/CNT.2014.7062749 (https://doi.org/10.1109/CNT.2014.7062749)
Citations: 2

Abstract

A speech signal can be represented as a combination of acoustic parameters extracted from it. The parameter vectors are assumed to be the constituents of the speech signal over a specified duration during which the signal is stationary. Typical representations are Mel Frequency Cepstral Coefficients (MFCCs) and Linear Prediction Coefficients (LPCs). Isolated word recognition involves mapping these parameters to speech, but a direct mapping is not possible because the realized speech waveform varies widely with speaker variability, modulation, context, etc. The parametric speech vectors are modeled using a Gaussian Mixture Model (GMM), and their distribution is observed. The Expectation Maximisation algorithm is used in the Radial Basis Function (RBF) network to best fit the test vector. A gradient descent algorithm applied to the RBF neural network is proposed to approximate functions of high nonlinear order. The learning rates of the network are made proportional to the probability densities obtained from the GMM. Isolated words of esophageal speech appear to be recognized better by this method than by previous methods, since it consists of nonlinear components.
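
The abstract gives no implementation details, so the following is only a minimal sketch of the general idea, not the authors' method. It assumes a diagonal-covariance GMM fitted by Expectation Maximisation, RBF centers placed at the GMM means, gradient descent on the output weights only, and a per-sample learning rate scaled by the GMM density at that sample. The function names, the farthest-point initialisation, and the synthetic 2-D data (standing in for MFCC frames of two words) are all hypothetical.

```python
import numpy as np

def component_densities(X, weights, means, variances):
    """Weighted diagonal-Gaussian density of each component at each sample, shape (n, k)."""
    diff = X[:, None, :] - means[None, :, :]
    log_p = -0.5 * (diff ** 2 / variances + np.log(2 * np.pi * variances)).sum(axis=2)
    return weights * np.exp(log_p)

def fit_gmm(X, k, iters=30):
    """Fit a diagonal-covariance GMM with EM (illustrative, no convergence check)."""
    n = len(X)
    # Farthest-point initialisation: a deterministic choice made for this sketch.
    means = [X[0]]
    for _ in range(k - 1):
        d2 = np.min([((X - m) ** 2).sum(axis=1) for m in means], axis=0)
        means.append(X[np.argmax(d2)])
    means = np.array(means, dtype=float)
    variances = np.tile(X.var(axis=0) + 1e-6, (k, 1))
    weights = np.full(k, 1.0 / k)
    for _ in range(iters):
        # E-step: responsibility of each component for each sample.
        comp = component_densities(X, weights, means, variances)
        resp = comp / comp.sum(axis=1, keepdims=True)
        # M-step: re-estimate mixture weights, means, and variances.
        nk = resp.sum(axis=0)
        weights = nk / n
        means = resp.T @ X / nk[:, None]
        sq = (X[:, None, :] - means[None, :, :]) ** 2
        variances = (resp[:, :, None] * sq).sum(axis=0) / nk[:, None] + 1e-6
    return weights, means, variances

def rbf_features(X, centers, width):
    """Gaussian RBF activations of each sample for each center, shape (n, k)."""
    d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    return np.exp(-d2 / (2 * width ** 2))

def train_rbf(X, Y, centers, width, density, base_lr=0.2, epochs=50):
    """Per-sample gradient descent on the output weights of an RBF network."""
    Phi = rbf_features(X, centers, width)
    W = np.zeros((Phi.shape[1], Y.shape[1]))
    # Assumption: learning rate proportional to the GMM density of each sample.
    lr = base_lr * density / density.max()
    for _ in range(epochs):
        for i in range(len(X)):
            err = Phi[i] @ W - Y[i]              # prediction error for sample i
            W -= lr[i] * np.outer(Phi[i], err)   # squared-error gradient step
    return W

def demo():
    """Train on two synthetic clusters and return the training accuracy."""
    rng = np.random.default_rng(0)
    X = np.vstack([rng.normal(0.0, 0.5, (50, 2)), rng.normal(3.0, 0.5, (50, 2))])
    labels = np.repeat([0, 1], 50)
    Y = np.eye(2)[labels]                        # one-hot targets
    weights, means, variances = fit_gmm(X, k=2)
    density = component_densities(X, weights, means, variances).sum(axis=1)
    W = train_rbf(X, Y, means, width=1.0, density=density)
    pred = (rbf_features(X, means, 1.0) @ W).argmax(axis=1)
    return (pred == labels).mean()

if __name__ == "__main__":
    print("training accuracy:", demo())
```

Scaling the step size by density means that frames lying in well-populated regions of the feature space move the weights more than outlying frames do. That is one plausible reading of "learning rates ... proportional to the probability densities"; the paper itself may tie the rates to hidden units or to mixture components instead.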