Pronaya Prosun Das, S. M. Allayear, Ruhul Amin, Zahida Rahman
Title: Bangladeshi dialect recognition using Mel Frequency Cepstral Coefficient, Delta, Delta-delta and Gaussian Mixture Model
Published in: 2016 Eighth International Conference on Advanced Computational Intelligence (ICACI)
Publication date: 2016-02-01
DOI: 10.1109/ICACI.2016.7449852 (https://doi.org/10.1109/ICACI.2016.7449852)
Citations: 25
Abstract
Automatic recognition systems are widely and successfully applied in speech processing to categorize observed utterances by speaker identity, dialect, and language. Much research has been carried out on detecting the speech, dialects, and languages of different regions throughout the world, but to our knowledge, work on the dialects of Bangladesh is scarce. These dialects differ considerably from one another. In this paper, we present a method for detecting different Bangladeshi dialects that uses Mel Frequency Cepstral Coefficients (MFCCs) and their Delta and Delta-delta coefficients as the main features, and Gaussian Mixture Models (GMMs) to classify the characteristics of a specific dialect. Specifically, we extract the MFCCs, Deltas, and Delta-deltas from the speech signal and merge them to form a feature vector for a specific dialect. The GMM is trained with the iterative Expectation Maximization (EM) algorithm, with the feature vectors serving as input. The scheme is tested on 5 databases of 30 speech samples each, containing the dialects of the Borishal, Noakhali, Sylhet, Chittagong, and Chapai Nawabganj regions of Bangladesh. Experiments show that GMM adaptation gives comparably good performance.
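The pipeline described in the abstract (stack MFCCs with their Deltas and Delta-deltas, fit one GMM per dialect with EM, and pick the dialect whose model scores an utterance highest) can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation. The abstract does not give the delta window, the number of mixture components, or the covariance structure, so the values here (a 2-frame regression window, 4 diagonal-covariance components, 13 cepstral coefficients) are assumptions. The MFCC frames are random synthetic data standing in for real speech, and all function names and the labels `dialect_A` and `dialect_B` are hypothetical.

```python
import numpy as np

def delta(feat, width=2):
    """Regression-based delta over time; feat has shape (frames, coeffs)."""
    padded = np.pad(feat, ((width, width), (0, 0)), mode="edge")
    n_frames = feat.shape[0]
    out = np.zeros_like(feat, dtype=float)
    for n in range(1, width + 1):
        ahead = padded[width + n: width + n + n_frames]    # frame t + n
        behind = padded[width - n: width - n + n_frames]   # frame t - n
        out += n * (ahead - behind)
    return out / (2 * sum(n * n for n in range(1, width + 1)))

def stack_features(mfcc):
    """Concatenate MFCC, Delta and Delta-delta into one vector per frame."""
    d = delta(mfcc)
    return np.hstack([mfcc, d, delta(d)])

def component_log_prob(X, weights, means, var):
    """log(w_k) + log N(x | mu_k, diag(var_k)) for every frame and component."""
    diff = X[:, None, :] - means[None, :, :]
    log_gauss = -0.5 * (np.log(2 * np.pi * var)[None] + diff ** 2 / var[None]).sum(axis=2)
    return np.log(weights)[None, :] + log_gauss

def logsumexp_rows(a):
    m = a.max(axis=1, keepdims=True)
    return (m + np.log(np.exp(a - m).sum(axis=1, keepdims=True))).ravel()

def fit_gmm(X, k=4, iters=50, seed=0):
    """Fit a diagonal-covariance GMM with EM (assumed settings, see lead-in)."""
    rng = np.random.default_rng(seed)
    n = X.shape[0]
    means = X[rng.choice(n, size=k, replace=False)]
    var = np.tile(X.var(axis=0) + 1e-6, (k, 1))
    weights = np.full(k, 1.0 / k)
    for _ in range(iters):
        # E-step: responsibility of each component for each frame
        log_p = component_log_prob(X, weights, means, var)
        resp = np.exp(log_p - logsumexp_rows(log_p)[:, None])
        # M-step: re-estimate weights, means and variances
        nk = resp.sum(axis=0) + 1e-10
        weights = nk / nk.sum()
        means = resp.T @ X / nk[:, None]
        var = np.maximum(resp.T @ (X ** 2) / nk[:, None] - means ** 2, 1e-6)
    return weights, means, var

def avg_log_likelihood(X, model):
    """Mean per-frame log-likelihood of X under a fitted GMM."""
    return logsumexp_rows(component_log_prob(X, *model)).mean()

def classify(mfcc, models):
    """Return the dialect label whose GMM gives the highest likelihood."""
    feats = stack_features(mfcc)
    return max(models, key=lambda name: avg_log_likelihood(feats, models[name]))

# Synthetic stand-in for MFCC frames of two dialects (not real speech data).
rng = np.random.default_rng(0)
train = {
    "dialect_A": rng.normal(0.0, 1.0, (300, 13)),
    "dialect_B": rng.normal(3.0, 1.0, (300, 13)),
}
models = {name: fit_gmm(stack_features(m)) for name, m in train.items()}
predicted = classify(rng.normal(3.0, 1.0, (80, 13)), models)
```

With real recordings, the synthetic arrays would be replaced by MFCC frames extracted from each dialect's training utterances, with one model fitted per dialect. Diagonal covariances are chosen here only to keep the sketch short.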