Abderrahim Ezzine, H. Satori, Mohamed Hamidi, K. Satori
2020 International Conference on Intelligent Systems and Computer Vision (ISCV), published 2020-06-01. DOI: https://doi.org/10.1109/ISCV49265.2020.9204250
Moroccan Dialect Speech Recognition System Based on CMU SphinxTools
The main aim of an Automatic Speech Recognition (ASR) system is to emulate a human listener, using a learning approach and speech data from the language under study. In this paper, we describe a speech recognition system for the Moroccan dialect (Darija), implemented to recognize the first ten Arabic digits as spoken in Darija, with data collected from 20 speakers, both male and female. The system is built with the CMU Sphinx tools, using Hidden Markov Models (HMMs) trained on a small dataset, with Mel-frequency cepstral coefficients (MFCCs) as the features computed in the feature extraction phase. Our best accuracy, 96.27%, was obtained with 8 Gaussian mixtures (GMMs).
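The abstract does not state how the accuracy figure is computed. As a minimal sketch, assuming it is the usual word-level accuracy, (N - S - D - I) / N, where N is the number of reference words and S, D, I are the substitutions, deletions and insertions in a minimum-edit alignment, the calculation could look like the following. The function names and the digit labels `d1`, `d2`, ... are hypothetical placeholders, not taken from the paper or from the CMU Sphinx tools.

```python
def edit_counts(ref, hyp):
    """Return (substitutions, deletions, insertions) for a minimum-edit alignment."""
    # dp[i][j] holds (cost, subs, dels, ins) for aligning ref[:i] with hyp[:j].
    dp = [[None] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
    dp[0][0] = (0, 0, 0, 0)
    for i in range(1, len(ref) + 1):
        dp[i][0] = (i, 0, i, 0)          # only deletions
    for j in range(1, len(hyp) + 1):
        dp[0][j] = (j, 0, 0, j)          # only insertions
    for i in range(1, len(ref) + 1):
        for j in range(1, len(hyp) + 1):
            c, s, d, n = dp[i - 1][j - 1]
            if ref[i - 1] == hyp[j - 1]:
                best = (c, s, d, n)              # match, no cost
            else:
                best = (c + 1, s + 1, d, n)      # substitution
            c, s, d, n = dp[i - 1][j]
            best = min(best, (c + 1, s, d + 1, n))   # deletion
            c, s, d, n = dp[i][j - 1]
            best = min(best, (c + 1, s, d, n + 1))   # insertion
            dp[i][j] = best
    return dp[-1][-1][1:]


def word_accuracy(ref, hyp):
    """Word accuracy in percent: 100 * (N - S - D - I) / N."""
    if not ref:
        raise ValueError("reference transcript must not be empty")
    s, d, i = edit_counts(ref, hyp)
    n = len(ref)
    return 100.0 * (n - s - d - i) / n


if __name__ == "__main__":
    # Illustrative labels only; these are not data from the paper.
    reference = ["d1", "d2", "d3", "d4", "d5"]
    hypothesis = ["d1", "d2", "d8", "d4", "d5"]
    print(word_accuracy(reference, hypothesis))
```

For isolated-digit recognition, where each utterance contains a single word, this reduces to the fraction of utterances recognized correctly.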