Two novel FDLP based feature extraction methods for improvement of speech recognition

2010 5th International Symposium on Telecommunications Pub Date : 2010-12-01 DOI:10.1109/ISTEL.2010.5734095

Y. Shekofteh, F. Almasganj, Ahmadreza Rezaei, M. M. Goodarzi


Citations: 1



Two novel FDLP based feature extraction methods for improvement of speech recognition

In conventional automatic speech recognition systems, the linguistic information in the speech signal is usually extracted from short-time frames of about 10–30 ms. This paper proposes two novel methods for extracting long-term information from the speech signal. Both methods are based on “sub-band FDLP” (frequency-domain linear prediction), which divides a long-time frame of the signal into several sub-bands. The MFCC algorithm is then used to represent the long-term temporal features of each sub-band. The results show that the proposed methods improve the recognition rate by 1.73%. The methods were evaluated on the FarsDat database, and their robustness under different noise conditions was also tested.
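The abstract does not give implementation details, so the following is only a minimal sketch of the general sub-band FDLP idea, not the authors' method. It assumes a naive DCT-II of the long frame, uniform non-overlapping sub-bands, and autocorrelation-based linear prediction; all function names and parameters are hypothetical, and the MFCC stage applied afterwards in the paper is omitted. Linear prediction applied to the DCT coefficients of each band yields an all-pole model whose "spectrum" approximates that band's temporal envelope.

```python
import math

def dct2(x):
    """Naive O(N^2) DCT-II of a long frame (unnormalised)."""
    n = len(x)
    return [sum(x[t] * math.cos(math.pi * (t + 0.5) * k / n) for t in range(n))
            for k in range(n)]

def autocorr(c, order):
    """Autocorrelation of the coefficient sequence for lags 0..order."""
    return [sum(c[i] * c[i + lag] for i in range(len(c) - lag))
            for lag in range(order + 1)]

def levinson(r, order):
    """Levinson-Durbin recursion; returns (a, err) with a[0] == 1."""
    a = [1.0] + [0.0] * order
    err = r[0]
    for i in range(1, order + 1):
        acc = r[i] + sum(a[j] * r[i - j] for j in range(1, i))
        k = -acc / err                      # reflection coefficient
        new_a = a[:]
        for j in range(1, i):
            new_a[j] = a[j] + k * a[i - j]
        new_a[i] = k
        a = new_a
        err *= (1.0 - k * k)
    return a, err

def fdlp_envelope(coeffs, order, n_points):
    """All-pole fit to DCT coefficients, evaluated on n_points.

    Index m of the result corresponds to relative time m / n_points
    within the long frame.
    """
    a, err = levinson(autocorr(coeffs, order), order)
    env = []
    for m in range(n_points):
        w = math.pi * m / n_points
        re = sum(a[k] * math.cos(w * k) for k in range(order + 1))
        im = -sum(a[k] * math.sin(w * k) for k in range(order + 1))
        env.append(err / (re * re + im * im))
    return env

def subband_fdlp(signal, n_bands, order, n_points):
    """Split the DCT of a long frame into uniform bands (an assumption)
    and return one temporal envelope per band."""
    c = dct2(signal)
    width = len(c) // n_bands
    return [fdlp_envelope(c[b * width:(b + 1) * width], order, n_points)
            for b in range(n_bands)]
```

For example, `subband_fdlp(frame, n_bands=2, order=2, n_points=100)` returns two envelopes of 100 points each. A practical system would more likely use a fast DCT, perceptually motivated band windows, and a higher model order.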

