Creating language and acoustic models using Kaldi to build an automatic speech recognition system for Kannada language

Thimmaraja G. Yadava
Published in: 2017 2nd IEEE International Conference on Recent Trends in Electronics, Information & Communication Technology (RTEICT)
Publication date: 2017-05-01
DOI: 10.1109/RTEICT.2017.8256578 (https://doi.org/10.1109/RTEICT.2017.8256578)
Citations: 11

Abstract

This paper demonstrates the creation of Language Models (LMs) and Acoustic Models (AMs) with the Kaldi speech recognition toolkit to build a robust Automatic Speech Recognition (ASR) system for the Kannada language. Speech data collected from farmers in Karnataka in an uncontrolled environment is used to develop the ASR models. The collected speech must be transcribed in a machine-readable form, so the Indic Language Transliteration Tool (IT3 to UTF-8) is used for transcription. The dictionary for the collected speech data is created using the Indian Language Speech sound Label (ILSL12) set. The AMs are built using the Gaussian Mixture Model (GMM) and the Subspace GMM (SGMM). Of the validated speech data, 80% is used for training and 20% for testing. The accuracy and Word Error Rate (WER) of the ASR models are reported and discussed. The developed ASR models can be used in a spoken query system that lets farmers access timely agricultural commodity prices and weather information in Kannada.
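The 80%/20% split and the WER metric mentioned in the abstract can be illustrated with a minimal Python sketch. This is not the paper's code or Kaldi's scoring pipeline: the function names `split_utterances` and `word_error_rate`, the fixed random seed, and the placeholder utterance IDs and words are all assumptions made for illustration. WER is taken here in its usual form, the word-level edit distance (substitutions, deletions, insertions) divided by the number of reference words.

```python
import random
from typing import List, Sequence, Tuple


def split_utterances(
    utt_ids: Sequence[str], train_fraction: float = 0.8, seed: int = 0
) -> Tuple[List[str], List[str]]:
    """Shuffle utterance IDs reproducibly and split them into train/test lists."""
    ids = sorted(utt_ids)                 # sort first so the result does not depend on input order
    random.Random(seed).shuffle(ids)      # hypothetical fixed seed, for repeatability only
    cut = int(len(ids) * train_fraction)  # number of training utterances
    return ids[:cut], ids[cut:]


def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference length."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference transcript must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen so far
    # and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1        # reference word missing from the hypothesis
            insertion = curr[j - 1] + 1   # extra word in the hypothesis
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1] / len(ref)


if __name__ == "__main__":
    # Placeholder utterance IDs, not taken from the paper's corpus.
    train, test = split_utterances([f"utt{n:03d}" for n in range(10)])
    print(len(train), len(test))

    # Placeholder words: one substitution and one deletion against four reference words.
    print(word_error_rate("w1 w2 w3 w4", "w1 wX w3"))
```

A test-set figure would normally sum the errors and the reference word counts over all test utterances before dividing, rather than average per-utterance rates.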