Efficient voice activity detection in reverberant enclosures using far field microphones
Authors: Theodore Petsatodis, Christos Boukis
Published in: 2009 16th International Conference on Digital Signal Processing, 2009-07-05
DOI: 10.1109/ICDSP.2009.5201159 (https://doi.org/10.1109/ICDSP.2009.5201159)
Citations: 2
Abstract
This paper proposes an algorithm for voice activity detection under reverberant conditions. Because it uses far-field microphones, the proposed solution must process speech signals of highly varying intensity and signal-to-noise ratio that are contaminated with several echoes. The core of the system is a pair of Hidden Markov Models that model the speech-presence and speech-absence situations. To minimise mis-detections an adaptive threshold is used, while a hang-over scheme caters for the intra-frame correlation of speech signals. Experimental results obtained in a typical office room using a single far-field microphone support the analysis.
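As a rough illustration of how an adaptive threshold and a hang-over scheme can be combined, the sketch below post-processes per-frame scores. It is not the authors' implementation: the abstract does not give the threshold rule or the hang-over length, so the function name `vad_decisions`, the parameters `margin`, `alpha` and `hangover`, and the running-average threshold update are all assumptions. The input `llr` is assumed to be a per-frame log-likelihood ratio between a speech-presence model and a speech-absence model, computed elsewhere.

```python
def vad_decisions(llr, margin=1.0, alpha=0.9, hangover=3):
    """Turn per-frame scores into speech / non-speech decisions.

    llr      -- list of per-frame log-likelihood ratios (assumed input:
                speech-presence model vs. speech-absence model)
    margin   -- hypothetical offset added to the tracked non-speech level
    alpha    -- hypothetical smoothing factor for that tracked level
    hangover -- hypothetical number of frames kept as speech after the
                score drops back below the threshold
    """
    decisions = []
    # Running estimate of the score level seen during non-speech frames.
    floor = llr[0] if llr else 0.0
    remaining = 0  # hang-over frames still to be emitted as speech

    for score in llr:
        threshold = floor + margin  # adaptive threshold (assumed rule)
        if score > threshold:
            is_speech = True
            remaining = hangover  # re-arm the hang-over counter
        elif remaining > 0:
            is_speech = True  # hang-over: extend the speech segment
            remaining -= 1
        else:
            is_speech = False
            # Adapt the non-speech level only on frames judged as non-speech.
            floor = alpha * floor + (1 - alpha) * score
        decisions.append(is_speech)
    return decisions


if __name__ == "__main__":
    scores = [0.0, 0.1, 5.0, 0.2, 0.1, 0.0, 0.0]
    print(vad_decisions(scores, hangover=2))
```

In this sketch a single high-scoring frame is followed by `hangover` additional speech frames, which avoids clipping low-energy speech tails; the threshold tracks the non-speech score level so that it can follow slow changes in the background.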