Mohamed Hassan Mohamed, Ashraf Mohamed Ali Hassan, N. M. Hussein Hassan
2016 International Conference on Electrical, Electronics, and Optimization Techniques (ICEEOT), published 2016-03-03
DOI: 10.1109/ICEEOT.2016.7755165 (https://doi.org/10.1109/ICEEOT.2016.7755165)
Automatic Speech Annotation Based on Enhanced Wavelet Packets Best Tree Encoding (EWPBTE) Feature
This paper introduces a fully automated Arabic phone recognition system based on the Enhanced Wavelet Packets Best Tree Encoding (EWPBTE) 15-point speech feature. WPBTE is enhanced by adding an energy component, implemented in Matlab; this raises recognizer accuracy by about 65 percentage points (from 21.07% to 86.23%), which is the main contribution of this paper. EWPBTE is used to find phoneme boundaries along a speech utterance. Hidden Markov Models (HMMs) with Gaussian mixtures are used to build the statistical models, implemented with the HMM Tool Kit (HTK). The system identifies spoken phones at a recognition rate of 57.01% with Mel Frequency Cepstral Coefficients (MFCC), 21.07% with WPBTE, and 86.23% with EWPBTE. The proposed EWPBTE vector has 15 components, compared with 39 for MFCC, which makes it a promising feature vector for further research and development.
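To make the idea of a wavelet packet best tree plus an energy term more concrete, the following is a minimal Python sketch, not the authors' Matlab implementation. It assumes a Haar filter, the non-normalized Shannon entropy as the cost function, and raw frame energy as the added component; none of these choices is stated in the abstract. The function names (`haar_split`, `shannon_cost`, `best_tree`, `frame_energy`) are hypothetical, and the sketch does not attempt the 15-point encoding itself, because the abstract does not describe it.

```python
import math


def haar_split(x):
    """One level of orthonormal Haar analysis: returns (approximation, detail)."""
    s = math.sqrt(2.0)
    pairs = [(x[i], x[i + 1]) for i in range(0, len(x) - 1, 2)]
    approx = [(p + q) / s for p, q in pairs]
    detail = [(p - q) / s for p, q in pairs]
    return approx, detail


def shannon_cost(coeffs):
    """Additive (non-normalized) Shannon entropy cost: -sum(c^2 * log(c^2))."""
    total = 0.0
    for c in coeffs:
        e = c * c
        if e > 0.0:  # zero coefficients contribute nothing to the cost
            total -= e * math.log(e)
    return total


def best_tree(x, max_level, path=""):
    """Bottom-up best-basis search over a Haar wavelet packet tree.

    Returns (cost, leaves), where leaves maps a node path to its coefficients.
    A path is a string of 'a' (approximation) and 'd' (detail) steps;
    the empty string denotes the root, i.e. the undecomposed frame.
    Assumption: len(x) is a multiple of 2**max_level, so no samples are dropped.
    """
    cost_here = shannon_cost(x)
    if max_level == 0 or len(x) < 2:
        return cost_here, {path: list(x)}
    approx, detail = haar_split(x)
    cost_a, leaves_a = best_tree(approx, max_level - 1, path + "a")
    cost_d, leaves_d = best_tree(detail, max_level - 1, path + "d")
    # Keep the split only if the children are strictly cheaper than the parent.
    if cost_a + cost_d < cost_here:
        return cost_a + cost_d, {**leaves_a, **leaves_d}
    return cost_here, {path: list(x)}


def frame_energy(x):
    """Raw frame energy (sum of squares); any scaling is an assumption here."""
    return sum(v * v for v in x)


if __name__ == "__main__":
    # Hypothetical 32-sample frame: a single sinusoid.
    frame = [math.sin(2 * math.pi * 3 * n / 32) for n in range(32)]
    _, leaves = best_tree(frame, max_level=3)
    print("best-tree leaves:", sorted(leaves))
    print("frame energy:", frame_energy(frame))
```

The set of leaf paths describes the shape of the selected tree, which is the kind of information a best-tree encoding would summarize. Because the Haar transform used here is orthonormal, the energy summed over the leaves equals the frame energy, so the energy term carries amplitude information that the tree shape alone does not.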