Assessment of L2 intelligibility: Comparing L1 listeners and automatic speech recognition

IF 4.6 · CAS Tier 1 (Literature) · JCR Q1 · EDUCATION & EDUCATIONAL RESEARCH
ReCALL, 35(1), 89-104 · Pub Date: 2022-11-18 · DOI: 10.1017/S0958344022000192
Solène Inceoglu, Wen-Hsin Chen, Hyojung Lim
{"title":"Assessment of L2 intelligibility: Comparing L1 listeners and automatic speech recognition","authors":"Solène Inceoglu, Wen-Hsin Chen, Hyojung Lim","doi":"10.1017/S0958344022000192","DOIUrl":null,"url":null,"abstract":"Abstract An increasing number of studies are exploring the benefits of automatic speech recognition (ASR)–based dictation programs for second language (L2) pronunciation learning (e.g. Chen, Inceoglu & Lim, 2020; Liakin, Cardoso & Liakina, 2015; McCrocklin, 2019), but how ASR recognizes accented speech and the nature of the feedback it provides to language learners is still largely under-researched. The current study explores whether the intelligibility of L2 speakers differs when assessed by native (L1) listeners versus ASR technology, and reports on the types of intelligibility issues encountered by the two groups. Twelve L1 listeners of English transcribed 48 isolated words targeting the /ɪ-i/ and /æ-ε/ contrasts and 24 short sentences that four Taiwanese intermediate learners of English had produced using Google’s ASR dictation system. Overall, the results revealed lower intelligibility scores for the word task (ASR: 40.81%, L1 listeners: 38.62%) than the sentence task (ASR: 75.52%, L1 listeners: 83.88%), and highlighted strong similarities in the error types – and their proportions – identified by ASR and the L1 listeners. However, despite similar recognition scores, correlations indicated that the ASR recognition of the L2 speakers’ oral productions mirrored the L1 listeners’ judgments of intelligibility in the word and sentence tasks for only one speaker, with significant positive correlations for one additional speaker in each task. This suggests that the extent to which ASR approaches L1 listeners at recognizing accented speech may depend on individual speakers and the type of oral speech.","PeriodicalId":47046,"journal":{"name":"Recall","volume":"35 1","pages":"89 - 104"},"PeriodicalIF":4.6000,"publicationDate":"2022-11-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"1","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Recall","FirstCategoryId":"98","ListUrlMain":"https://doi.org/10.1017/S0958344022000192","RegionNum":1,"RegionCategory":"文学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"EDUCATION & EDUCATIONAL RESEARCH","Score":null,"Total":0}
Citations: 1

Abstract

An increasing number of studies are exploring the benefits of automatic speech recognition (ASR)-based dictation programs for second language (L2) pronunciation learning (e.g. Chen, Inceoglu & Lim, 2020; Liakin, Cardoso & Liakina, 2015; McCrocklin, 2019), but how ASR recognizes accented speech and the nature of the feedback it provides to language learners is still largely under-researched. The current study explores whether the intelligibility of L2 speakers differs when assessed by native (L1) listeners versus ASR technology, and reports on the types of intelligibility issues encountered by the two groups. Twelve L1 listeners of English transcribed 48 isolated words targeting the /ɪ-i/ and /æ-ε/ contrasts and 24 short sentences that four Taiwanese intermediate learners of English had produced using Google's ASR dictation system. Overall, the results revealed lower intelligibility scores for the word task (ASR: 40.81%, L1 listeners: 38.62%) than the sentence task (ASR: 75.52%, L1 listeners: 83.88%), and highlighted strong similarities in the error types, and their proportions, identified by ASR and the L1 listeners. However, despite similar recognition scores, correlations indicated that the ASR recognition of the L2 speakers' oral productions mirrored the L1 listeners' judgments of intelligibility in the word and sentence tasks for only one speaker, with significant positive correlations for one additional speaker in each task. This suggests that the extent to which ASR approaches L1 listeners at recognizing accented speech may depend on individual speakers and the type of oral speech.
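The analysis the abstract describes, scoring each item for intelligibility and then correlating ASR scores with L1-listener scores per speaker, can be sketched in a few lines of Python. This is a minimal illustration, not the authors' code: it assumes the `SpeechRecognition` package's Google Web Speech backend as a stand-in for Google's dictation system, assumes Pearson's r as the correlation statistic (the abstract does not name one), and uses invented scores.

```python
# Minimal sketch of the analysis described in the abstract (not the authors'
# code). Hypothetical assumptions: the SpeechRecognition package's Google
# Web Speech backend stands in for Google's ASR dictation system; Pearson's
# r stands in for the unspecified correlation statistic; the per-item
# scores below are invented for illustration.

import speech_recognition as sr
from scipy.stats import pearsonr

def transcribe(wav_path: str) -> str:
    """Return Google's one-best hypothesis for a recording, or "" if none."""
    recognizer = sr.Recognizer()
    with sr.AudioFile(wav_path) as source:
        audio = recognizer.record(source)
    try:
        return recognizer.recognize_google(audio).lower().strip()
    except sr.UnknownValueError:  # ASR returned no hypothesis
        return ""

def intelligibility(targets: list[str], hypotheses: list[str]) -> float:
    """Percentage of items whose ASR hypothesis matches the target word."""
    hits = sum(t == h for t, h in zip(targets, hypotheses))
    return 100 * hits / len(targets)

# Hypothetical per-item outcomes for one speaker: 1/0 for ASR recognition,
# and the proportion of the 12 L1 listeners who recognized each item.
asr_scores = [1, 0, 1, 1, 0, 1, 0, 1]
l1_scores = [0.92, 0.17, 0.75, 1.00, 0.33, 0.83, 0.25, 0.67]

r, p = pearsonr(asr_scores, l1_scores)
print(f"ASR vs. L1 listeners: r = {r:.2f}, p = {p:.3f}")
```

Run per speaker and per task (words vs. sentences), this yields the kind of speaker-by-speaker correlations the study reports; the `transcribe` helper additionally requires audio files and network access to produce real hypotheses.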
Source journal: ReCALL
CiteScore: 8.50
Self-citation rate: 4.40%
Articles published: 17