Evaluation of smoothed group delay spectrum distance measure in speaker-independent speech recognition

Taizo Umezaki, Harald Singer, Fumitada Itakura
Journal: Electronics and Communications in Japan (Part III: Fundamental Electronic Science)
DOI: 10.1002/ECJC.4430741005 (https://doi.org/10.1002/ECJC.4430741005)
Publication date: 2007-02-22

Abstract
The smoothed group delay spectrum (SGDS) distance measure is evaluated in speaker-independent recognition experiments. First, the appropriate level of smoothing of the group delay spectrum (GDS) is investigated by adding noise, etc., to the input speech. Then a comparison with the speaker-dependent case is made. An experiment is reported in which, for low-amplitude parts of speech (e.g., unvoiced speech), the standard (LPC) distance measure is used in the interframe distance calculation instead of the SGDS distance measure. This method prevents the loss of recognition accuracy caused by too strong an emphasis on certain spectral elements, so that a consistently high recognition accuracy can be achieved.

Finally, the SGDS distance measure is evaluated with the GDS represented in the spectral domain as a discrete Fourier transform (DFT) of the LPC coefficients. Compared with the SGDS calculated by weighting the LPC cepstrum coefficients, computation time and memory space can be reduced without loss of recognition accuracy. Furthermore, a low-bit quantization of the GDS is reported, and a high recognition rate is achieved with only 32 bits per frame.
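As an illustration of the spectral-domain representation mentioned in the abstract, the following is a minimal sketch of how a group delay spectrum can be obtained directly from LPC coefficients with two DFTs. It is not taken from the paper: the function name `lpc_group_delay`, the FFT size, and the sign convention (the all-pole model is taken as 1/A(z) with A(z) = 1 + a_1 z^-1 + ... + a_p z^-p) are assumptions made here, and the smoothing and quantization steps the authors evaluate are not reproduced.

```python
import numpy as np


def lpc_group_delay(a, n_fft=64):
    """Group delay (in samples) of the all-pole model 1/A(z).

    `a` holds the LPC polynomial coefficients [1, a_1, ..., a_p]
    (assumed convention, not confirmed by the source).
    Returns n_fft // 2 + 1 values on a uniform grid from 0 to pi.
    """
    a = np.asarray(a, dtype=float)
    k = np.arange(len(a))
    # A(w) = sum_k a_k exp(-j w k), evaluated by a zero-padded DFT.
    A = np.fft.rfft(a, n_fft)
    # B(w) = sum_k k a_k exp(-j w k), so that dA/dw = -j B(w).
    B = np.fft.rfft(k * a, n_fft)
    # d(arg A)/dw = Im(A'/A) = -Re(B/A); the group delay of 1/A is
    # -d(arg(1/A))/dw = d(arg A)/dw.
    return -np.real(B / A)


# Single real pole at radius 0.5: the analytic group delay is
# (r cos w - r^2) / (1 - 2 r cos w + r^2).
tau = lpc_group_delay([1.0, -0.5])
```

For the single-pole example, the analytic expression gives r / (1 - r) = 1 sample at zero frequency, which is a convenient check on the sign convention before applying the function to real LPC frames.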