Language Specific Information from LP Residual Signal Using Linear Sub Band Filters
Soma Siddhartha, Jagabandhu Mishra, S. Prasanna
2020 National Conference on Communications (NCC), 2020-02-01
DOI: 10.1109/NCC48643.2020.9056005 (https://doi.org/10.1109/NCC48643.2020.9056005)
Citation count: 3
Abstract
This work analyzes and compares methods for parameterizing excitation source information for the spoken language recognition task. The excitation source information is represented by two features, residual mel frequency cepstral coefficients (RMFCC) and residual linear frequency cepstral coefficients (RLFCC), both derived from the linear prediction (LP) residual signal. Conventionally, following practice in speaker recognition, perceptually motivated mel sub-band filters are used to parameterize the LP residual signal, yielding RMFCC. Because the LP residual signal is impulsive in nature (i.e., it has a flat spectrum), this study proposes a parameterization based on uniform triangular sub-band filters, called RLFCC. Experimental results show that the RLFCC features perform better than the RMFCC features. With a DNN-WA architecture, combining the RLFCC features with MFCC features gives a relative improvement of 20% in average equal error rate (EERavg) over the combined system of MFCC and RMFCC features.
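To make the RLFCC idea concrete, the following is a minimal Python sketch of one plausible pipeline: estimate LP coefficients for a frame, inverse-filter to obtain the residual, pass the residual power spectrum through uniformly spaced triangular filters, then take the log and a DCT. It is not the authors' implementation. The function names and every parameter value (LP order 10, 20 filters, 13 coefficients, 512-point FFT, 8 kHz sampling) are assumptions chosen for illustration; the abstract does not state them.

```python
import numpy as np

def lp_residual(frame, order=10):
    """LP residual of one frame (autocorrelation method). 'order' is an assumed value."""
    w = frame * np.hamming(len(frame))
    # Autocorrelation at lags 0..order
    r = np.array([np.dot(w[:len(w) - k], w[k:]) for k in range(order + 1)])
    # Toeplitz normal equations R a = r[1:], with a tiny ridge for numerical stability
    R = np.array([[r[abs(i - j)] for j in range(order)] for i in range(order)])
    R = R + (1e-9 * r[0] + 1e-12) * np.eye(order)
    a = np.linalg.solve(R, r[1:])
    # Inverse filter A(z) = 1 - sum_k a_k z^-k applied to the windowed frame
    inverse_filter = np.concatenate(([1.0], -a))
    return np.convolve(w, inverse_filter)[:len(w)]

def linear_filterbank(n_filters, n_fft, sample_rate):
    """Triangular filters whose centre frequencies are uniformly spaced in Hz."""
    n_bins = n_fft // 2 + 1
    edges = np.linspace(0.0, sample_rate / 2.0, n_filters + 2)
    freqs = np.linspace(0.0, sample_rate / 2.0, n_bins)
    bank = np.zeros((n_filters, n_bins))
    for m in range(n_filters):
        lo, mid, hi = edges[m], edges[m + 1], edges[m + 2]
        rising = (freqs - lo) / (mid - lo)
        falling = (hi - freqs) / (hi - mid)
        bank[m] = np.clip(np.minimum(rising, falling), 0.0, None)
    return bank

def dct2(x, n_out):
    """First n_out coefficients of an unnormalised DCT-II."""
    n = len(x)
    k = np.arange(n_out)[:, None]
    i = np.arange(n)[None, :]
    return np.cos(np.pi * k * (2 * i + 1) / (2 * n)) @ x

def rlfcc(frame, sample_rate=8000, order=10, n_filters=20, n_ceps=13, n_fft=512):
    """Cepstral coefficients of the LP residual using linear sub-band filters."""
    residual = lp_residual(frame, order)
    power = np.abs(np.fft.rfft(residual, n_fft)) ** 2
    energies = linear_filterbank(n_filters, n_fft, sample_rate) @ power
    return dct2(np.log(energies + 1e-10), n_ceps)
```

Under the same assumptions, an RMFCC-style variant would differ only in the filterbank: the `edges` array would be spaced uniformly on the mel scale instead of in Hz, which concentrates filters at low frequencies rather than covering the flat residual spectrum evenly.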