Authors: Prapaporn Rattanatamrong, Onanong Kongmeesub, Tanakorn Dittaporn, Natphitchayuk Siwahansaphan, Sirapop Chatarupa, Vataya Chunwijitra, Sumonmas Thatphithakkul
Venue: 2022 19th International Joint Conference on Computer Science and Software Engineering (JCSSE)
Published: 2022-06-22
DOI: 10.1109/jcsse54890.2022.9836310 (https://doi.org/10.1109/jcsse54890.2022.9836310)
Thai Preschooler Speech Recognition for Voice Enabled Interactive Counting Exercises
Over time, voice recognition technology has increased its capacity to handle the intricacy of children's speech, which differs from adults' speech in pitch and vocalization. However, obtaining strong results in children's voice recognition, particularly in Thai, is hampered by a lack of sufficient children's speech data for training. This paper describes our first steps in developing a speech recognition model for Thai children and assesses the viability of integrating it with SmartMath, a Web-based interactive numerical skill practice application for preschoolers. To build an adequate recognizer for Thai children, two methodologies were investigated: spectrogram classification and GMM-HMM based ASR. The experimental results show that the GMM-HMM based ASR has the best word error rate (WER), with a 4.23 percent reduction in error on the individual counting task compared with spectrogram (speech image) classification. For the incremental counting task, the best WER achieved by the ASR model is 6.81 percent. Further data analysis suggests potential ways of improving children's ASR, which could lead to the use of children's ASR to close the learning gap.
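The abstract reports its results as WER. As a reminder of what that metric measures, here is a minimal sketch of the standard word-level edit-distance definition, WER = (substitutions + deletions + insertions) / number of reference words. It is not the authors' scoring code; the function name `word_error_rate` and the romanized Thai counting words in the example are illustrative assumptions, not data from the paper.

```python
def word_error_rate(reference, hypothesis):
    """Return WER = (substitutions + deletions + insertions) / len(reference)."""
    if not reference:
        raise ValueError("reference must contain at least one word")
    # prev[j] holds the edit distance between the reference prefix processed
    # so far and hypothesis[:j]; it starts as the distance from an empty prefix.
    prev = list(range(len(hypothesis) + 1))
    for i, ref_word in enumerate(reference, start=1):
        curr = [i]  # distance from reference[:i] to an empty hypothesis
        for j, hyp_word in enumerate(hypothesis, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # match or substitution
        prev = curr
    return prev[-1] / len(reference)


# Hypothetical example: a child counts one to five, the recognizer drops "sam".
reference = "nueng song sam si ha".split()
hypothesis = "nueng song si ha".split()
print(f"WER = {word_error_rate(reference, hypothesis):.2%}")  # WER = 20.00%
```

Under this definition, a single missed word in a five-word counting sequence already costs 20 percent, which gives a sense of scale for the 6.81 percent figure reported on the incremental counting task.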