Assessing Costa Rican children speech recognition by humans and machines

IF 0.1 Q4 MULTIDISCIPLINARY SCIENCES
Maribel Morales-Rodríguez, Marvin Coto-Jiménez
Journal: Tecnologia en Marcha
Publication date: 2022-11-16
Publication type: Journal Article
DOI: 10.18845/tm.v35i8.6453 (https://doi.org/10.18845/tm.v35i8.6453)
Citations: 0

Abstract

In recent years, a growing number of studies on human-computer interaction have been carried out, driven by the pervasive speech interfaces in systems such as cell phones and personal and home automation assistants. These studies include automatic speech recognition (ASR) and speech synthesis, and they consider an increasingly wide variety of signal conditions, such as noise and reverberation, as well as accents and age-related effects. For example, one of the key challenges is the development of ASR for children’s speech. Because current systems depend on language and accent, improving them requires research into speech recognition technologies suitable for children. In this paper, we assess commercial ASR systems for the recognition of Costa Rican children’s speech, for speakers aged between three and fourteen years. To establish a comparison and numeric validation of the ASR systems in recognizing children’s isolated words, we conducted a large subjective listening test that quantifies the differences and the challenges that remain for state-of-the-art ASR systems. The results show clear numeric differences between ASR systems and human perception, especially for younger children. Additionally, we provide suggestions for future research directions in the field.
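
The comparison described in the abstract, isolated-word recognition by ASR systems versus human listeners across ages, could be tallied along the following lines. This is only a minimal sketch under stated assumptions: the abstract does not describe the scoring procedure, so exact-match scoring after lowercasing, the age-group labels, the function names, and the trial data below are all hypothetical placeholders and not material from the study.

```python
from collections import defaultdict


def normalize(word):
    """Lowercase and trim so trivial formatting differences are not counted as errors."""
    return word.strip().lower()


def accuracy_by_group(trials):
    """Compute per-group recognition accuracy.

    trials: iterable of (age_group, target_word, recognized_word) tuples.
    Returns a dict mapping each age group to the fraction of exact matches.
    """
    hits = defaultdict(int)
    totals = defaultdict(int)
    for group, target, recognized in trials:
        totals[group] += 1
        if normalize(target) == normalize(recognized):
            hits[group] += 1
    return {group: hits[group] / totals[group] for group in totals}


# Invented placeholder trials for illustration only; NOT data from the paper.
asr_trials = [
    ("3-5", "casa", "casa"),
    ("3-5", "perro", "pero"),      # misrecognized word
    ("12-14", "casa", "Casa"),     # counts as correct after normalization
    ("12-14", "perro", "perro"),
]
human_trials = [
    ("3-5", "casa", "casa"),
    ("3-5", "perro", "perro"),
    ("12-14", "casa", "casa"),
    ("12-14", "perro", "perro"),
]

asr_acc = accuracy_by_group(asr_trials)
human_acc = accuracy_by_group(human_trials)

# Positive gap means human listeners recognized more words than the ASR system.
gap = {group: human_acc[group] - asr_acc[group] for group in human_acc}
print(gap)
```

Exact matching is the simplest choice for isolated words; a real evaluation might instead allow for orthographic variants or use a phoneme-level distance, which this sketch does not attempt.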
Source journal
Tecnologia en Marcha (MULTIDISCIPLINARY SCIENCES)
Self-citation rate: 0.00%
Articles published: 93
Review time: 28 weeks