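Demographic and Acoustic Factors related to Automatic Speech Recognition Inaccuracies for Child African American English Speakers.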

Brittany N Fletcher, Wei-Wen Hsu, Vesna D Novak, Mary E Wilkens, Amy W Hobek, Amy S Pratt, Michelle Leon, Kimmerly Harrell, Victoria S McKenna
Journal: Perspectives of the ASHA Special Interest Groups
Publication date: 2025-09-30
Publication type: Journal Article
DOI: 10.1044/2025_persp-25-00052 (https://doi.org/10.1044/2025_persp-25-00052)
PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12490741/pdf/
Citations: 0

Abstract

Demographic and Acoustic Factors related to Automatic Speech Recognition Inaccuracies for Child African American English Speakers.

Purpose: This study investigated the relationship between acoustic measures and Google's Speech-to-Text inaccuracies in recognizing speech of children ages 4-9 years who speak African American English (AAE).

Methods: Audio recordings were collected from 11 AAE-speaking children with speech stimuli targeting final plosive variations observed within the AAE dialect. Dialectal density was measured using the Diagnostic Evaluation of Language Variation Language Screener. Recordings were transcribed using Google's Speech-to-Text application (Google Voice), and inaccuracies were determined through comparison to researcher-extracted transcriptions. Acoustic measures from vowels preceding final plosives (including vowel duration, fundamental frequency, and average F1) were extracted using Praat and a custom MATLAB algorithm. Individual mixed-effects logistic regression models were fitted to analyze the relationships between acoustic measures and transcription accuracy (accurate vs. inaccurate) for voiced and voiceless plosives separately.
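
The accuracy coding described in the Methods can be illustrated with a minimal sketch. It assumes that each stimulus has one target word ending in a plosive and that an item counts as accurate when that word appears in the speech-to-text output; the abstract does not specify the scoring rule, so this rule, the function names, and the example sentences are all hypothetical.

```python
import string


def tokens(text: str) -> list[str]:
    """Lowercase, strip punctuation, and split into words."""
    table = str.maketrans("", "", string.punctuation)
    return text.lower().translate(table).split()


def code_target(reference: str, asr_output: str, target: str) -> int:
    """Return 1 if the target word survives in the ASR output, else 0.

    Assumed scoring rule (not taken from the study): presence of the
    target word anywhere in the ASR transcript counts as accurate.
    """
    if target not in tokens(reference):
        raise ValueError(f"target {target!r} not in reference transcript")
    return 1 if target in tokens(asr_output) else 0


def inaccuracy_rates(items):
    """Proportion of inaccurate items per voicing category."""
    counts = {}
    for reference, asr_output, target, voicing in items:
        errors, total = counts.get(voicing, (0, 0))
        accurate = code_target(reference, asr_output, target)
        counts[voicing] = (errors + (1 - accurate), total + 1)
    return {voicing: e / t for voicing, (e, t) in counts.items()}


# Hypothetical items: (researcher transcript, ASR output, target, voicing)
items = [
    ("He has a big bag.", "he has a big back", "bag", "voiced"),
    ("Look at the cat.", "look at the cat", "cat", "voiceless"),
    ("I see a bed.", "I see a bed", "bed", "voiced"),
]

rates = inaccuracy_rates(items)
print(rates)
```

The binary codes produced this way correspond to the accurate vs. inaccurate outcome that the regression models take as their dependent variable.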

Results: There were no significant differences between inaccuracy rates for voiced and voiceless plosive productions, nor were acoustic measures predictive of speech-to-text inaccuracy. However, age and dialect density were significantly related to voiceless plosive accuracy.
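
For readers unfamiliar with the model class, a mixed-effects logistic regression with a single predictor can be written schematically as below. This is a generic form, not the authors' reported specification: the random intercept per child and the symbols are assumptions, since the abstract does not state the random-effects structure.

```latex
\operatorname{logit}\,\Pr(y_{ij} = 1) = \beta_0 + \beta_1 x_{ij} + u_j,
\qquad u_j \sim \mathcal{N}(0, \sigma_u^2)
```

Here $y_{ij} = 1$ when production $i$ from child $j$ is transcribed accurately, $x_{ij}$ is the predictor of interest (for example vowel duration, age, or dialect density), and $u_j$ absorbs child-level variation across repeated productions.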

Conclusions: The complexities of voice, motor and articulatory development within children can be characterized by acoustic measures. These measures inform acoustic algorithms created for speech technology. Research on acoustic measures in young child AAE speech, with considerations for dialect variability and age, will enhance speech recognition technology and clinical best practices.
