Palestinian Arabic regional accent recognition
Abualsoud Hanani, H. Basha, Y. Sharaf, Stephen Eugene Taylor
2015 International Conference on Speech Technology and Human-Computer Dialogue (SpeD), published 2015-12-03
DOI: https://doi.org/10.1109/SPED.2015.7343088
Citations: 10
Abstract
We attempt to automatically recognize a speaker's accent among regional Palestinian Arabic accents from four regions of Palestine: Jerusalem (JE), Hebron (HE), Nablus (NA) and Ramallah (RA). To this end, we applied state-of-the-art techniques from speaker and language identification, namely the Gaussian Mixture Model - Universal Background Model (GMM-UBM), Gaussian Mixture Model - Support Vector Machine (GMM-SVM) and I-vector frameworks. All of these systems were trained and tested on speech from 200 speakers. The GMM-SVM and I-vector systems outperformed the baseline GMM-UBM system. The best result, an accuracy of 81.5%, was obtained by an I-vector system with 64 Gaussian components, compared with an accuracy of 73.4% achieved by human listeners on the same test utterances.
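To make the baseline concrete, the following is a minimal sketch of GMM-UBM scoring in general, not the authors' implementation. It trains a diagonal-covariance UBM with EM on pooled frames, derives one model per accent by MAP adaptation of the means, and labels a test utterance by its average log-likelihood ratio against the UBM. The feature frames are synthetic stand-ins (the abstract does not say which acoustic features were used), and the function names, the 8-component UBM, the 4-dimensional features and the relevance factor of 16 are all assumptions chosen for illustration. The GMM-SVM and I-vector systems are not covered here.

```python
import numpy as np

ACCENTS = ["JE", "HE", "NA", "RA"]  # region codes from the abstract


def log_likelihood(X, weights, means, variances):
    """Return per-frame GMM log-likelihoods and component posteriors."""
    diff = X[:, None, :] - means[None, :, :]
    log_p = -0.5 * (np.sum(diff ** 2 / variances, axis=2)
                    + np.sum(np.log(2 * np.pi * variances), axis=1))
    log_p = log_p + np.log(weights)
    m = log_p.max(axis=1, keepdims=True)  # log-sum-exp for stability
    log_total = m + np.log(np.exp(log_p - m).sum(axis=1, keepdims=True))
    return log_total[:, 0], np.exp(log_p - log_total)


def train_ubm(X, n_components, n_iter=20, seed=0):
    """Fit a diagonal-covariance GMM to the pooled frames with plain EM."""
    rng = np.random.default_rng(seed)
    means = X[rng.choice(len(X), n_components, replace=False)].copy()
    variances = np.tile(X.var(axis=0), (n_components, 1))
    weights = np.full(n_components, 1.0 / n_components)
    for _ in range(n_iter):
        _, post = log_likelihood(X, weights, means, variances)
        n_k = post.sum(axis=0) + 1e-10
        weights = n_k / n_k.sum()
        means = (post.T @ X) / n_k[:, None]
        second_moment = (post.T @ X ** 2) / n_k[:, None]
        variances = np.maximum(second_moment - means ** 2, 1e-3)
    return weights, means, variances


def map_adapt_means(X, ubm, relevance=16.0):
    """Shift UBM means toward one accent's data; weights and variances stay fixed."""
    weights, means, variances = ubm
    _, post = log_likelihood(X, weights, means, variances)
    n_k = post.sum(axis=0)
    data_means = (post.T @ X) / np.maximum(n_k, 1e-10)[:, None]
    alpha = (n_k / (n_k + relevance))[:, None]
    return weights, alpha * data_means + (1 - alpha) * means, variances


def classify(X, ubm, accent_models):
    """Pick the accent whose model gives the highest mean log-likelihood ratio."""
    ubm_ll = log_likelihood(X, *ubm)[0].mean()
    scores = {accent: log_likelihood(X, *model)[0].mean() - ubm_ll
              for accent, model in accent_models.items()}
    return max(scores, key=scores.get), scores


def synthetic_frames(rng, accent_index, n_frames, dim=4):
    """Hypothetical stand-in for acoustic features: overlapping, accent-shifted noise."""
    centre = np.zeros(dim)
    centre[accent_index] = 1.0
    return rng.normal(centre, 1.0, size=(n_frames, dim))


rng = np.random.default_rng(0)
train = {a: synthetic_frames(rng, i, 500) for i, a in enumerate(ACCENTS)}
ubm = train_ubm(np.vstack(list(train.values())), n_components=8)
accent_models = {a: map_adapt_means(X, ubm) for a, X in train.items()}

test_utterance = synthetic_frames(rng, ACCENTS.index("NA"), 500)
predicted, scores = classify(test_utterance, ubm, accent_models)
print(predicted, {a: round(float(s), 3) for a, s in scores.items()})
```

Adapting only the means keeps every accent model aligned component by component with the UBM, which is also the property that GMM-SVM systems rely on when they stack the adapted means into a supervector. With real speech, the synthetic frames would be replaced by per-utterance feature matrices and the component count tuned on held-out data.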