Keep the noise down: On the performance of automatic speech recognition of voice-recordings in web surveys

Katharina Meitinger, Sabien van der Sluis, Matthias Schonlau
Survey practice, volume 7 6, published 2024-02-29 (Journal Article)
DOI: https://doi.org/10.29115/sp-2023-0022
Citations: 0

Abstract

Voice-recordings are increasingly implemented in web surveys, but the resulting audio data need to be transcribed before analysis. Since manual coding is too time- and work-intensive, researchers often rely on automatic speech recognition (ASR) systems for the transcription of the voice-recordings. However, ASR tools might create partly incorrect transcriptions and potentially change the content of responses. If the ASR performance (i.e., accuracy and validity) differs by subgroup and contextual factors, a bias is introduced in the analysis of open-ended questions. We assessed the impact of sociodemographic and contextual factors on the accuracy and validity of ASR transcriptions with data from the Longitudinal Internet Studies for the Social Sciences (LISS) panel collected in December 2020. We find that background noise reduces the accuracy and validity of ASR transcriptions. In addition, validity improved when the respondent was alone during the survey. Fortunately, we did not find any evidence of systematic differences across subgroups (age, sex, education), devices or respondent location.