Haar-like filtering based speech detection using integral signal for sensornet
J. Nishimura, T. Kuroda
2008 3rd International Conference on Sensing Technology, pp. 52-56, 2008-12-01
DOI: 10.1109/ICSENST.2008.4757072
Citations: 5
Abstract
Speech detection using Haar-like filtering is proposed as a new, very low calculation cost method for sensornet applications. Simple Haar-like filters with variable filter width and shift width are trained on training samples to learn the filter parameters appropriate for detecting speech. To further decrease the calculation cost, the use of an intermediate signal representation called the "integral signal" is proposed. Our method yielded a speech/nonspeech classification accuracy of 97.44% for an input length of 0.1 s. Compared with the high-performance feature extraction method MFCC (mel-frequency cepstrum coefficient), the proposed Haar-like filtering can be approximately 93.71% efficient in terms of the total number of add and multiply calculations, while achieving an error rate of only 2.56% relative to MFCC.
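The abstract does not give the filter definition, so the following is only a minimal sketch of the general idea, under stated assumptions: the integral signal is taken to be a running prefix sum of the samples, and the Haar-like filter is taken to be a 1-D two-box filter (+1 over the first half of the window, -1 over the second half). The function names, the mean-absolute-response feature, and the example values are hypothetical and are not taken from the paper.

```python
from itertools import accumulate


def integral_signal(x):
    """Prefix sums with a leading zero: ii[k] = x[0] + ... + x[k-1]."""
    return [0] + list(accumulate(x))


def box_sum(ii, start, stop):
    """Sum of x[start:stop], read from the integral signal with one subtraction."""
    return ii[stop] - ii[start]


def haar_like_responses(x, width, shift):
    """Responses of an assumed two-box Haar-like filter.

    Each window of `width` samples contributes (sum of first half) minus
    (sum of second half); consecutive windows start `shift` samples apart.
    """
    if width < 2 or width % 2 != 0:
        raise ValueError("width must be an even integer >= 2")
    if shift < 1:
        raise ValueError("shift must be a positive integer")
    ii = integral_signal(x)
    half = width // 2
    responses = []
    for start in range(0, len(x) - width + 1, shift):
        first = box_sum(ii, start, start + half)
        second = box_sum(ii, start + half, start + width)
        responses.append(first - second)
    return responses


def haar_like_feature(x, width, shift):
    """Assumed scalar feature: mean absolute filter response over the input."""
    responses = haar_like_responses(x, width, shift)
    if not responses:
        return 0.0
    return sum(abs(r) for r in responses) / len(responses)


if __name__ == "__main__":
    samples = [1, 2, 3, 4, 5, 6]
    print(integral_signal(samples))
    print(haar_like_responses(samples, width=4, shift=2))
    print(haar_like_feature(samples, width=4, shift=2))
```

In this sketch, once the integral signal has been built, each filter response costs a fixed small number of subtractions regardless of the filter width, and no multiplications. This is the kind of saving the integral signal is meant to provide.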