PCA-based human auditory filter bank for speech recognition
V. D. Minh, Sungyoung Lee
2004 International Conference on Signal Processing and Communications (SPCOM '04), published 2004-12-11
DOI: 10.1109/SPCOM.2004.1458488 (https://doi.org/10.1109/SPCOM.2004.1458488)
Citations: 15
Abstract
Although Mel-frequency cepstral coefficients (MFCC) have been shown to perform very well under most conditions, only limited effort has been made to optimize the shape of the filters in the filter bank. In addition, MFCC does not approximate the critical bandwidth of the human auditory system. This paper presents a new feature extraction approach that (1) decouples filter bandwidth from the other filter-bank parameters, inspired by the critical bands of the human auditory system, and (2) designs the shape of the filters in the filter bank. In this approach, the filter bandwidth is determined from the equivalent rectangular bandwidth approximation of the critical bands, and the filter-bank coefficients are obtained in a data-driven manner by applying principal component analysis (PCA) to the FFT spectrum of the training data. Through experiments, we demonstrate the noise robustness of this approach and the improved performance of the resulting recognition systems.
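The abstract does not give the exact procedure, so the following is only a minimal sketch of the general idea under stated assumptions, not the authors' implementation. It assumes the Glasberg and Moore formula for the equivalent rectangular bandwidth, a hypothetical set of center frequencies, and that each filter's shape is taken as the first principal component of the training magnitude spectra restricted to the FFT bins inside that filter's band. The names `erb_hz`, `pca_filter_bank`, and `centers` are illustrative, and random noise stands in for real training speech.

```python
import numpy as np


def erb_hz(fc_hz):
    """Equivalent rectangular bandwidth in Hz (Glasberg & Moore formula; an assumption here)."""
    return 24.7 * (4.37 * fc_hz / 1000.0 + 1.0)


def pca_filter_bank(spectra, sample_rate, centers_hz):
    """Derive one data-driven filter per center frequency.

    spectra: (n_frames, n_bins) magnitude spectra, e.g. from np.fft.rfft.
    Returns an (n_filters, n_bins) weight matrix that is zero outside each band.
    """
    n_bins = spectra.shape[1]
    bin_hz = np.linspace(0.0, sample_rate / 2.0, n_bins)
    bank = np.zeros((len(centers_hz), n_bins))
    for i, fc in enumerate(centers_hz):
        # Bandwidth depends only on the center frequency (assumed ERB rule).
        half = erb_hz(fc) / 2.0
        idx = np.where((bin_hz >= fc - half) & (bin_hz <= fc + half))[0]
        if idx.size == 0:
            # Band narrower than the bin spacing: fall back to the nearest bin.
            idx = np.array([np.argmin(np.abs(bin_hz - fc))])
        band = spectra[:, idx]
        centered = band - band.mean(axis=0)
        # PCA via SVD: rows of vt are the unit-norm principal directions.
        _, _, vt = np.linalg.svd(centered, full_matrices=False)
        w = vt[0]
        if w.sum() < 0:
            w = -w  # resolve the sign ambiguity of the principal component
        bank[i, idx] = w
    return bank


# Synthetic stand-in for training data: 200 frames of 512 samples at 16 kHz.
rng = np.random.default_rng(0)
frames = rng.standard_normal((200, 512))
spectra = np.abs(np.fft.rfft(frames, axis=1))  # shape (200, 257)

centers = np.array([250.0, 500.0, 1000.0, 2000.0, 4000.0])  # hypothetical centers
bank = pca_filter_bank(spectra, 16000, centers)

# Log filter-bank energies; the absolute value guards against negative projections.
features = np.log(np.abs(spectra @ bank.T) + 1e-10)
```

In this sketch the bandwidth is set by the center frequency alone, so changing the number or placement of filters does not change how wide each one is, which is one way to read the decoupling described above. A cepstral front end would typically apply a DCT to `features` afterwards; that step is omitted here.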