Improving MFCC based ASI system performance using novel multifractal cascade features
Authors: L. Ling, D. C. González
Venue: Fifth International Conference on Intelligent Control and Information Processing
Publication date: 2014-08-01
DOI: 10.1109/ICICIP.2014.7010326 (https://doi.org/10.1109/ICICIP.2014.7010326)
In this work we use a set of multifractal features, namely the Variable Variance Gaussian Parameter (VVGP) features, extracted from a cascade model of speech signals, to improve the performance of a traditional speaker recognition approach. We describe in detail the stochastic cascade model used to represent these VVGP features, as well as the feature extraction procedure. The discriminative capability of the VVGP features is evaluated in two steps. First, we implement an automatic text-independent speaker identification (ASI) system based only on the VVGP features and Gaussian mixture model (GMM) classifiers. Then, we evaluate classification strategies that jointly use the VVGP features and traditional mel-frequency cepstral coefficient (MFCC) features under two multimodal fusion schemes, namely score-level and feature-level fusion. Experimental tests reveal that the VVGP features are discriminative and capable of improving the performance of MFCC-based ASI systems.
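The two fusion schemes named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the weight `alpha`, and the example scores are all hypothetical, and the abstract does not say how the scores are actually combined. The sketch assumes that feature-level fusion concatenates the MFCC and multifractal vectors of each frame before a single classifier is trained, and that score-level fusion takes a weighted sum of the per-speaker scores (for example, GMM log-likelihoods) from two separate classifiers.

```python
from typing import Dict, List, Sequence


def fuse_features(mfcc_frames: Sequence[Sequence[float]],
                  multifractal_frames: Sequence[Sequence[float]]) -> List[List[float]]:
    """Feature-level fusion: concatenate the two feature vectors of each frame."""
    if len(mfcc_frames) != len(multifractal_frames):
        raise ValueError("both feature streams must have the same number of frames")
    return [list(m) + list(v) for m, v in zip(mfcc_frames, multifractal_frames)]


def fuse_scores(mfcc_scores: Dict[str, float],
                multifractal_scores: Dict[str, float],
                alpha: float = 0.5) -> Dict[str, float]:
    """Score-level fusion: weighted sum of per-speaker scores.

    alpha is a hypothetical weight on the MFCC system; (1 - alpha) goes to
    the multifractal system.
    """
    if mfcc_scores.keys() != multifractal_scores.keys():
        raise ValueError("both systems must score the same set of speakers")
    return {spk: alpha * mfcc_scores[spk] + (1.0 - alpha) * multifractal_scores[spk]
            for spk in mfcc_scores}


def identify(scores: Dict[str, float]) -> str:
    """Return the speaker with the highest (fused) score."""
    return max(scores, key=scores.get)


# Made-up scores for two enrolled speakers, for illustration only.
mfcc_scores = {"spk_a": -10.0, "spk_b": -9.0}
multifractal_scores = {"spk_a": -4.0, "spk_b": -8.0}
fused = fuse_scores(mfcc_scores, multifractal_scores, alpha=0.5)
print(identify(mfcc_scores), identify(fused))
```

With these made-up numbers the MFCC scores alone favor `spk_b`, while the fused scores favor `spk_a`, which shows how a second feature stream can change the decision. In practice the weight would be tuned on held-out data.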