A Novel Algorithm to Speech Endpoint Detection in Noisy Environments Based on Energy-Entropy Method

Q4 Engineering

Majlesi Journal of Electrical Engineering, Pub Date: 2009-10-28, DOI: 10.1234/mjee.v2i4.139

H. Dehghani


Citations: 0

Abstract


A Novel Algorithm to Speech Endpoint Detection in Noisy Environments Based on Energy-Entropy Method

Endpoint detection, the task of distinguishing speech from non-speech segments, is one of the key preprocessing operations in automatic speech recognition (ASR) systems. The energy of the speech signal and the zero-crossing rate (ZCR) are usually used to locate the beginning and end of an utterance. Both features have been shown to be effective for endpoint detection, but they fail in high-noise environments. In this paper, we integrate the modified Teager approach with energy-entropy features. In the new algorithm, the Teager energy is used to determine crude endpoints, and the energy-entropy features are used to make the final decision. The advantage of this method is that the background noise does not need to be estimated. It is therefore well suited to conditions in which the noise at the beginning or end is very strong, or in which there is not enough “silence” at the beginning or end of the utterance. Experimental results on Farsi speech show that the accuracy of the algorithm is satisfactory for speech endpoint detection.
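The abstract gives no formulas, so the following is only a minimal sketch of the two-stage idea it describes, not the author's method. It assumes the standard Teager-Kaiser operator (the paper's "modified" variant is not specified), uses plain spectral entropy as a stand-in for the energy-entropy features, and combines them with an illustrative threshold rule. The function names, the frame length, and the parameters energy_ratio and entropy_max are all hypothetical.

```python
import cmath
import math


def teager_energy(x):
    """Teager-Kaiser operator: psi[n] = x[n]^2 - x[n-1] * x[n+1] (interior samples only)."""
    return [x[n] * x[n] - x[n - 1] * x[n + 1] for n in range(1, len(x) - 1)]


def frame_teager(x, frame_len):
    """Mean absolute Teager energy of each non-overlapping frame."""
    means = []
    for start in range(0, len(x) - frame_len + 1, frame_len):
        psi = teager_energy(x[start:start + frame_len])
        means.append(sum(abs(p) for p in psi) / len(psi))
    return means


def spectral_entropy(frame):
    """Shannon entropy (in nats) of the normalized power spectrum, using a naive DFT.

    Bins 1 .. n/2 - 1 are used (DC and Nyquist are skipped). An all-zero frame
    returns 0.0 by convention.
    """
    n = len(frame)
    power = []
    for k in range(1, n // 2):
        s = sum(frame[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        power.append(abs(s) ** 2)
    total = sum(power)
    if total == 0:
        return 0.0
    return -sum((p / total) * math.log(p / total) for p in power if p > 0)


def detect_endpoints(x, frame_len=32, energy_ratio=0.1, entropy_max=1.0):
    """Two-stage endpoint sketch (assumed rule, not taken from the paper).

    Stage 1: crude endpoints are the first and last frames whose Teager energy
    reaches energy_ratio times the peak frame energy. The threshold is relative
    to the peak, so no separate background-noise estimate is made.
    Stage 2: boundary frames with a flat, noise-like spectrum (entropy above
    entropy_max) are trimmed from both ends.
    Returns (start_sample, end_sample), end exclusive, or None if nothing is found.
    """
    te = frame_teager(x, frame_len)
    if not te or max(te) == 0:
        return None
    peak = max(te)
    active = [i for i, e in enumerate(te) if e >= energy_ratio * peak]
    first, last = active[0], active[-1]

    def speech_like(i):
        frame = x[i * frame_len:(i + 1) * frame_len]
        return spectral_entropy(frame) <= entropy_max

    while first < last and not speech_like(first):
        first += 1
    while last > first and not speech_like(last):
        last -= 1
    return first * frame_len, (last + 1) * frame_len
```

Calling detect_endpoints(signal) on a list of samples returns the estimated start and end sample indices. The thresholds here are placeholders chosen for toy signals and would need tuning on real speech; for realistic frame sizes, the naive DFT would normally be replaced by an FFT routine.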


Source journal

Majlesi Journal of Electrical Engineering (Engineering - Electrical and Electronic Engineering)

CiteScore

1.20

Self-citation rate

0.00%

Articles published

About the journal: The scope of the Majlesi Journal of Electrical Engineering (MJEE) ranges from mathematical foundations to practical engineering design in all areas of electrical engineering. The editorial board is international, and original unpublished papers are welcome from throughout the world. The journal is devoted primarily to research papers, but very high quality survey and tutorial papers are also published. There is no publication charge for authors.