“Spanish Políglota”: an automatic Speech Recognition system based on HMM

Jonathan A. Zea, Josafá Aguiar
2021 Second International Conference on Information Systems and Software Technologies (ICI2ST), published 2021-03-01
DOI: 10.1109/ICI2ST51859.2021.00011 (https://doi.org/10.1109/ICI2ST51859.2021.00011)

Abstract

The goal of this automatic speech recognition (ASR) system is to recognize audio queries that request the static translation of a given Spanish word into a specified language. We call this system Spanish Políglota. The pronunciation dictionary for the language model is obtained by grapheme-to-phoneme conversion, which was developed with Scheme scripts for the Festival Speech Synthesis System and the SPPAS Spanish lexicon. The possible audio queries are restricted by a BNF grammar that we designed for this project. A triphone acoustic model was generated from a set of 1621 audio recordings of words. This acoustic model is based on an N-gram model whose probabilities are estimated by maximum likelihood estimation (MLE). We evaluated the prediction of individual words as well as of synthetic phrases: we generated 1577 synthetic phrases by concatenating the words of our audio set. Performance was also measured on a new set of audio recordings from a different speaker. Isolated word recognition achieved 77.91% correct predictions. However, performance dropped on the synthetic phrases and on the second speaker's speech. We consider this work an initial step towards the development of a fully functional automatic speech recognition system.
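To illustrate what maximum likelihood estimation means for an N-gram model, the following minimal sketch estimates bigram probabilities as relative frequencies, P(w2 | w1) = count(w1, w2) / count(w1). The abstract does not state the N-gram order or the training data, so the bigram order, the function name `bigram_mle`, the sentence markers `<s>` and `</s>`, and the toy queries are all assumptions made for illustration, not the authors' implementation.

```python
from collections import Counter


def bigram_mle(sentences):
    """Estimate P(cur | prev) = count(prev, cur) / count(prev) by relative frequency."""
    context_counts = Counter()
    bigram_counts = Counter()
    for tokens in sentences:
        # Pad with start/end markers so sentence boundaries are modelled too.
        padded = ["<s>"] + tokens + ["</s>"]
        for prev, cur in zip(padded, padded[1:]):
            context_counts[prev] += 1
            bigram_counts[(prev, cur)] += 1
    return {
        (prev, cur): count / context_counts[prev]
        for (prev, cur), count in bigram_counts.items()
    }


# Hypothetical toy queries, not taken from the paper's grammar or recordings.
toy_queries = [
    ["traducir", "casa", "a", "ingles"],
    ["traducir", "perro", "a", "frances"],
    ["traducir", "casa", "a", "frances"],
]
probs = bigram_mle(toy_queries)
print(probs[("traducir", "casa")])  # relative frequency of "casa" after "traducir"
```

Because the estimate is a plain relative frequency, any word pair that never occurs in the training text gets no probability at all. That is tolerable when a fixed grammar restricts the queries, as described above, but it is also why open-vocabulary systems usually add smoothing.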