Pronunciation Training on Isolated Kannada Words Using "Kannada Kali" - A Cloud Based Smart Phone Application

2018 IEEE International Conference on Cloud Computing in Emerging Markets (CCEM). Pub Date: 2018-11-01. DOI: 10.1109/CCEM.2018.00017

Savitha Murthy, Ankit Anand, Avinash Kumar, Ajaykumar S. Cholin, Ankita Shetty, Aditya D. Bhat, Akshay Venkatesh, Lingaraj Kothiwale, D. Sitaram, Viraj Kumar


Citations: 1


Pronunciation Training on Isolated Kannada Words Using "Kannada Kali" - A Cloud Based Smart Phone Application

An automated pronunciation feedback system on a smartphone is useful for a student trying to learn a new language at his or her own pace. The objective of our research is to implement a pronunciation training system with minimal language-specific data. Our proposed system consists of an Android application as the front end, and a pronunciation evaluation and mispronunciation detection framework as the back end, hosted on a cloud. We conduct our experiments on isolated spoken words in Kannada. Our pronunciation evaluation (for a spoken word) on the cloud involves training a classifier on features obtained from Dynamic Time Warping (DTW) over Mel Frequency Cepstral Coefficients (MFCC) and Line Spectral Frequencies (LSF), and also on LSF features used directly (without DTW). We study the performance of different machine learning algorithms for pronunciation rating. We propose a novel semi-supervised approach for detecting mispronounced segments of a word using Self Organizing Maps (SOM), which are also deployed on the cloud. Our SOM implementation learns the features of automatically segmented reference speech. The trained SOM is then used to determine the deviations in the learner's pronunciation. We evaluate our system on 1169 Kannada audio samples from students around 18 to 25 years of age. The Kannada words are taken from first- and second-grade textbooks (treating learners as beginners who do not know Kannada) and contain two to five syllables. We report accuracy on binary and multi-class classification for different classifiers. The mispronounced segments detected using SOM correlate with the human ratings. Our approach to pronunciation evaluation and mispronunciation detection is based on minimal data and does not require a speech recognition system.
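The abstract does not give implementation details, so the following is only a minimal sketch of how a DTW-based feature between a reference and a learner utterance could be computed. It assumes each utterance has already been converted to a sequence of per-frame feature vectors (for example MFCC or LSF); the function name `dtw_distance`, the Euclidean frame distance, and the length normalisation are illustrative assumptions, not the authors' method.

```python
import numpy as np


def dtw_distance(ref, test):
    """Length-normalised DTW cost between two feature sequences.

    ref, test: arrays of shape (n_frames, n_features), e.g. per-frame
    MFCC or LSF vectors (feature extraction is assumed to happen elsewhere).
    """
    ref = np.asarray(ref, dtype=float)
    test = np.asarray(test, dtype=float)
    n, m = len(ref), len(test)

    # Pairwise Euclidean distance between every reference and test frame.
    local = np.linalg.norm(ref[:, None, :] - test[None, :, :], axis=2)

    # acc[i, j] = cheapest cost of aligning ref[:i] with test[:j].
    acc = np.full((n + 1, m + 1), np.inf)
    acc[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            acc[i, j] = local[i - 1, j - 1] + min(
                acc[i - 1, j],      # reference frame advances alone
                acc[i, j - 1],      # test frame advances alone
                acc[i - 1, j - 1],  # both advance together
            )

    # Normalise so that longer words do not automatically score worse.
    return acc[n, m] / (n + m)


# Synthetic stand-ins for real feature matrices (13 coefficients per frame).
rng = np.random.default_rng(0)
reference = rng.normal(size=(40, 13))
learner = reference[::2] + 0.1 * rng.normal(size=(20, 13))
score = dtw_distance(reference, learner)
```

A scalar such as `score`, computed separately for each feature type, could then serve as one input to the pronunciation-rating classifier; which classifier and which exact DTW features the paper uses is not stated in the abstract.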

