Comparison between DFT- and DWT-based speech/non-speech detection for adverse environments
Authors: T. V. Pham, G. Kubin
Venue: The 2011 International Conference on Advanced Technologies for Communications (ATC 2011)
Publication date: 2011-09-26
DOI: 10.1109/ATC.2011.6027490
URL: https://doi.org/10.1109/ATC.2011.6027490
Citations: 1
Abstract
The goal of this paper is to evaluate wavelet-based and frequency-based voice activity detection (VAD) algorithms under harsh conditions. A new frequency-based speech classifier has been developed that relies on a single subband distance feature in combination with an adaptive percentile filter. Experimental results in clean, noisy, and reverberant environments are provided. The results show that: (i) the group of algorithms exploiting the subband power distance feature mostly outperforms the state-of-the-art VADs standardized in G.729 B and ETSI AFE ES 202 050 in terms of classification measures; (ii) the robustness of the model-based VAD methods still holds in a completely mismatched reverberant environment.