Isarn digit speech recognition using HMM
Sasithron Sangjamraschaikun, Pusadee Seresangtakul
2017 2nd International Conference on Information Technology (INCIT), 2017-11-01
DOI: 10.1109/INCIT.2017.8257882 (https://doi.org/10.1109/INCIT.2017.8257882)

We present an automatic digit-speech recognition system for Isarn, a dialect spoken in northeastern Thailand. An Isarn digit corpus was collected from native speakers. The system extracts speech features as Mel Frequency Cepstral Coefficients (MFCC) and uses a Hidden Markov Model (HMM) classifier for recognition. The paper covers isolated and continuous speech recognition, in both speaker-dependent and speaker-independent settings, for spoken Isarn numerals from 0 through 999. The system was evaluated by recognition correctness. Isolated recognition achieved 90.00% in the speaker-dependent setting and 79.80% in the speaker-independent setting; continuous recognition achieved 89.16% in the speaker-dependent setting and 82.47% in the speaker-independent setting.
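
To make the HMM classification step concrete, here is a minimal sketch of isolated-word recognition: one HMM per vocabulary word, with the recognizer picking the word whose model gives the observation sequence the highest likelihood under the forward algorithm. This is not the authors' implementation. It assumes discrete observation symbols (for example, vector-quantized feature frames) in place of real MFCC vectors, and the word labels `digit_a` and `digit_b` and all model parameters are hypothetical toy values.

```python
import math


def safe_log(p):
    """Natural log that maps a zero probability to -inf instead of raising."""
    return math.log(p) if p > 0 else float("-inf")


def log_sum_exp(values):
    """Numerically stable log(sum(exp(v))) over a list of log-values."""
    m = max(values)
    if m == float("-inf"):
        return m
    return m + math.log(sum(math.exp(v - m) for v in values))


def forward_log_likelihood(obs, start, trans, emit):
    """Return log P(obs | model) for a discrete HMM via the forward algorithm.

    start[i]    : probability of starting in state i
    trans[i][j] : probability of moving from state i to state j
    emit[i][k]  : probability that state i emits symbol k
    """
    n = len(start)
    alpha = [safe_log(start[i]) + safe_log(emit[i][obs[0]]) for i in range(n)]
    for symbol in obs[1:]:
        alpha = [
            log_sum_exp([alpha[i] + safe_log(trans[i][j]) for i in range(n)])
            + safe_log(emit[j][symbol])
            for j in range(n)
        ]
    return log_sum_exp(alpha)


def recognize(obs, models):
    """Pick the word whose HMM assigns the highest likelihood to obs."""
    return max(models, key=lambda word: forward_log_likelihood(obs, *models[word]))


# Hypothetical two-word vocabulary: left-to-right, 2 states, 3 symbols each.
LEFT_TO_RIGHT = [[0.6, 0.4], [0.0, 1.0]]
MODELS = {
    "digit_a": ([1.0, 0.0], LEFT_TO_RIGHT, [[0.8, 0.1, 0.1], [0.1, 0.8, 0.1]]),
    "digit_b": ([1.0, 0.0], LEFT_TO_RIGHT, [[0.1, 0.1, 0.8], [0.1, 0.8, 0.1]]),
}

if __name__ == "__main__":
    print(recognize([0, 0, 1, 1], MODELS))
    print(recognize([2, 2, 1, 1], MODELS))
```

A real system of the kind described would replace the discrete emission tables with continuous densities over MFCC vectors and would train the parameters from the corpus. Continuous recognition additionally requires decoding over a network of connected word models, which this sketch does not attempt.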