Improving noise estimation with RAPT pitch voice activity detection under low SNR condition
Supasit Chuwatthananurux, Dittaya Wanvarie
2016 8th International Conference on Knowledge and Smart Technology (KST), 2016-02-01
DOI: 10.1109/KST.2016.7440486 (https://doi.org/10.1109/KST.2016.7440486)
Citation count: 1
Abstract
Noise spectrum estimation is a fundamental component of speech enhancement and speech recognition systems. In this paper, we present pitch-based voice activity detection for noise estimation in input with a low signal-to-noise ratio (SNR). The noise power spectrum is approximated by the minimum power level over a period of the mixed signal. Since the noise level may not be stationary, the algorithm should update the estimate regularly. However, when a period contains speech, the spectral power is high and the noise level tends to be overestimated. To avoid this problem, we first use a voice activity detector to detect the presence of speech in the signal period. We propose that pitch information is effective in identifying speech activity in the mixed signal, even under low-SNR conditions. We adopt the Robust Algorithm for Pitch Tracking (RAPT), together with the ratio between the frame power and the minimum spectral power, to classify voice activity in each input frame. Experimental results show that the proposed algorithm is effective under low-SNR conditions: it achieves lower estimation errors than continuous spectrum minima tracking, Minima Controlled Recursive Averaging (MCRA), and estimation based solely on the spectral power ratio.
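
The gating idea described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. The abstract does not say how the pitch decision and the power ratio are combined, so the rule used here (a frame counts as speech if a pitch was found or the power ratio exceeds a threshold) is an assumption, as are the function names, the `window` and `ratio_threshold` parameters, and the choice to treat the first frame as noise-only. The pitch flags are taken as given input, for example from a RAPT-style pitch tracker, which is not implemented here.

```python
from collections import deque


def is_speech(frame_power, noise_floor, pitch_found, ratio_threshold=5.0):
    """Assumed decision rule: speech if a pitch was found OR the ratio of
    total frame power to total noise-floor power exceeds the threshold."""
    total = sum(frame_power)
    floor = sum(noise_floor)
    ratio = total / floor if floor > 0 else float("inf")
    return pitch_found or ratio > ratio_threshold


def estimate_noise(frames, pitch_flags, window=4, ratio_threshold=5.0):
    """Return one noise estimate (list of per-bin powers) per input frame.

    frames:      list of per-bin power spectra, one list per frame
    pitch_flags: one bool per frame, True if a pitch was detected
    """
    history = deque(maxlen=window)  # recent frames judged to be noise-only
    noise = list(frames[0])         # assumption: the first frame is noise-only
    history.append(noise)
    estimates = [list(noise)]
    for frame, pitched in zip(frames[1:], pitch_flags[1:]):
        if not is_speech(frame, noise, pitched, ratio_threshold):
            history.append(list(frame))
            # Per-bin minimum over the recent noise-only frames.
            noise = [min(bins) for bins in zip(*history)]
        # Speech frames leave the estimate unchanged, so speech energy
        # cannot inflate the noise floor.
        estimates.append(list(noise))
    return estimates


if __name__ == "__main__":
    frames = [[1.0, 2.0], [1.2, 1.8], [50.0, 60.0], [0.9, 2.1]]
    pitch_flags = [False, False, True, False]
    for estimate in estimate_noise(frames, pitch_flags):
        print(estimate)
```

In this toy input the third frame is flagged as voiced, so it is skipped and the estimate carries over from the previous frame. The same frame would also be rejected by the power-ratio test alone, which is the baseline the abstract compares against. The pitch flag matters most when speech is weak enough that the ratio stays under the threshold.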