M. Asgari, A. Sayadian, M. Farhadloo, E. A. Mehrizi
2008 Australasian Telecommunication Networks and Applications Conference, 2008-12-01. DOI: 10.1109/ATNAC.2008.4783359
Voice Activity Detection Using Entropy in Spectrum Domain
In this paper we develop a voice activity detection (VAD) algorithm based on estimating the entropy of the magnitude spectrum. A likelihood ratio test (LRT) is employed to determine the threshold that separates speech segments from non-speech segments. The entropy values of clean speech and of the noise signal are assumed to be Gaussian distributed. Applying the concept of entropy to the speech detection problem rests on the assumption that the signal spectrum is more organized during speech segments than during noise segments. One of the main advantages of this method is that it is not very sensitive to changes in noise level. Our simulation results show that the entropy-based VAD performs well in low signal-to-noise ratio (SNR) conditions (SNR < 0 dB).
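
The abstract gives no formulas, so the following is only a minimal sketch of the general idea, not the authors' implementation. It assumes that the magnitude spectrum of a frame is normalized to sum to one and treated as a probability mass function, that its Shannon entropy is the detection feature, and that the LRT compares two Gaussian likelihoods of that feature. The function names, the frame length, the pure tone standing in for an "organized" speech spectrum, and the Gaussian means and standard deviations are all hypothetical choices made for illustration.

```python
import cmath
import math
import random


def magnitude_spectrum(frame):
    """Magnitudes of the non-negative-frequency DFT bins (naive O(N^2) DFT)."""
    n = len(frame)
    return [
        abs(sum(x * cmath.exp(-2j * math.pi * k * t / n) for t, x in enumerate(frame)))
        for k in range(n // 2 + 1)
    ]


def spectral_entropy(frame, eps=1e-12):
    """Shannon entropy (nats) of the magnitude spectrum normalized to sum to 1."""
    mag = magnitude_spectrum(frame)
    total = sum(mag) + eps
    probs = [m / total for m in mag]
    return -sum(p * math.log(p + eps) for p in probs)


def gaussian_log_pdf(x, mean, std):
    """Log density of a univariate Gaussian."""
    return -0.5 * math.log(2 * math.pi * std ** 2) - (x - mean) ** 2 / (2 * std ** 2)


def is_speech(entropy, speech_stats, noise_stats, log_threshold=0.0):
    """Log-likelihood ratio test: speech if log p(h|speech) - log p(h|noise) > threshold."""
    llr = gaussian_log_pdf(entropy, *speech_stats) - gaussian_log_pdf(entropy, *noise_stats)
    return llr > log_threshold


# Illustration: a pure tone (concentrated spectrum) versus white Gaussian noise.
N = 64  # hypothetical frame length
rng = random.Random(0)
tone = [math.sin(2 * math.pi * 8 * t / N) for t in range(N)]
noise = [rng.gauss(0.0, 1.0) for _ in range(N)]

h_tone = spectral_entropy(tone)
h_noise = spectral_entropy(noise)
print(f"tone entropy:  {h_tone:.3f}")
print(f"noise entropy: {h_noise:.3f}")

# Hypothetical (mean, std) pairs; in practice they would be estimated from labeled data.
speech_stats = (1.0, 0.8)
noise_stats = (3.2, 0.3)
print("tone classified as speech: ", is_speech(h_tone, speech_stats, noise_stats))
print("noise classified as speech:", is_speech(h_noise, speech_stats, noise_stats))
```

Because the spectrum is normalized before the entropy is taken, scaling a frame by a constant gain leaves the feature essentially unchanged, which is one way to read the abstract's remark that the method is not very sensitive to changes in noise level.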