Comparing human and machine speech recognition in noise with QuickSIN.

IF 1.2 Q3 ACOUSTICS
Malcolm Slaney, Matthew B Fitzgerald
{"title":"Comparing human and machine speech recognition in noise with QuickSIN.","authors":"Malcolm Slaney, Matthew B Fitzgerald","doi":"10.1121/10.0028612","DOIUrl":null,"url":null,"abstract":"<p><p>A test is proposed to characterize the performance of speech recognition systems. The QuickSIN test is used by audiologists to measure the ability of humans to recognize continuous speech in noise. This test yields the signal-to-noise ratio at which individuals can correctly recognize 50% of the keywords in low-context sentences. It is argued that a metric for automatic speech recognizers will ground the performance of automatic speech-in-noise recognizers to human abilities. Here, it is demonstrated that the performance of modern recognizers, built using millions of hours of unsupervised training data, is anywhere from normal to mildly impaired in noise compared to human participants.</p>","PeriodicalId":73538,"journal":{"name":"JASA express letters","volume":null,"pages":null},"PeriodicalIF":1.2000,"publicationDate":"2024-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"JASA express letters","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1121/10.0028612","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q3","JCRName":"ACOUSTICS","Score":null,"Total":0}
引用次数: 0

Abstract

A test is proposed to characterize the performance of speech recognition systems. The QuickSIN test is used by audiologists to measure the ability of humans to recognize continuous speech in noise. This test yields the signal-to-noise ratio at which individuals can correctly recognize 50% of the keywords in low-context sentences. It is argued that a metric for automatic speech recognizers will ground the performance of automatic speech-in-noise recognizers to human abilities. Here, it is demonstrated that the performance of modern recognizers, built using millions of hours of unsupervised training data, is anywhere from normal to mildly impaired in noise compared to human participants.

用 QuickSIN 比较噪声中的人工和机器语音识别。
我们提出了一种测试方法来鉴定语音识别系统的性能。听力学家使用 QuickSIN 测试来测量人类在噪声中识别连续语音的能力。该测试可得出个人能正确识别低语境句子中 50% 关键字的信噪比。有人认为,自动语音识别器的衡量标准将使自动噪声语音识别器的性能与人类的能力相一致。本文证明,与人类参与者相比,使用数百万小时无监督训练数据构建的现代识别器在噪声中的表现从正常到轻微受损不等。
本文章由计算机程序翻译,如有差异,请以英文原文为准。
求助全文
约1分钟内获得全文 求助全文
来源期刊
CiteScore
1.70
自引率
0.00%
发文量
0
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
确定
请完成安全验证×
copy
已复制链接
快去分享给好友吧!
我知道了
右上角分享
点击右上角分享
0
联系我们:info@booksci.cn Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。 Copyright © 2023 布克学术 All rights reserved.
京ICP备2023020795号-1
ghs 京公网安备 11010802042870号
Book学术文献互助
Book学术文献互助群
群 号:481959085
Book学术官方微信