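How ready is speech-to-text for psychological language research? Evaluating the validity of AI-generated English transcripts for analyzing free-spoken responses in younger and older adults.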

IF 4.6 · Zone 2, Psychology · Q1 PSYCHOLOGY, EXPERIMENTAL

Behavior Research Methods Pub Date : 2024-10-01 Epub Date: 2024-05-21 DOI:10.3758/s13428-024-02440-1

Valeria A Pfeifer, Trish D Chilton, Matthew D Grilli, Matthias R Mehl

Pages: 7621-7631

PDF (PMC): https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11365748/pdf/

Citations: 0

Abstract



For the longest time, the gold standard in preparing spoken language corpora for text analysis in psychology was using human transcription. However, such standard comes at extensive cost, and creates barriers to quantitative spoken language analysis that recent advances in speech-to-text technology could address. The current study quantifies the accuracy of AI-generated transcripts compared to human-corrected transcripts across younger (n = 100) and older (n = 92) adults and two spoken language tasks. Further, it evaluates the validity of Linguistic Inquiry and Word Count (LIWC)-features extracted from these two kinds of transcripts, as well as transcripts specifically prepared for LIWC analyses via tagging. We find that overall, AI-generated transcripts are highly accurate with a word error rate of 2.50% to 3.36%, albeit being slightly less accurate for younger compared to older adults. LIWC features extracted from either transcripts are highly correlated, while the tagging procedure significantly alters filler word categories. Based on these results, automatic speech-to-text appears to be ready for psychological language research when using spoken language tasks in relatively quiet environments, unless filler words are of interest to researchers.
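The accuracy figures above are word error rates (WER). As a rough illustration of how such a figure is commonly computed, the sketch below uses the standard definition: the word-level edit distance (substitutions, deletions, insertions) between a reference transcript and an automatic transcript, divided by the number of reference words. The function name, the lowercasing and whitespace tokenization, and the sample sentences are assumptions made for this example; the abstract does not describe the authors' own scoring pipeline or text normalization.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """WER = (substitutions + deletions + insertions) / reference word count.

    Assumption: words are compared after lowercasing and splitting on
    whitespace; real pipelines usually also normalize punctuation.
    """
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference transcript is empty")

    # prev[j] holds the edit distance between the reference words seen
    # so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(
                prev[j] + 1,         # reference word missing from hypothesis
                curr[j - 1] + 1,     # extra word in hypothesis
                prev[j - 1] + cost,  # match or substitution
            )
        prev = curr
    return prev[-1] / len(ref)


# Made-up sentences: one substituted word out of ten reference words.
human = "well i went to the store with my sister yesterday"
machine = "well i went to the shore with my sister yesterday"
print(f"{word_error_rate(human, machine):.2%}")  # prints 10.00%
```

Under this definition, a WER of 2.50% to 3.36% corresponds to roughly one word error in every 30 to 40 reference words. A dropped filler word such as "um" counts as one error, which makes WER sensitive to how fillers are transcribed.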


Source journal

Behavior Research Methods Multiple-

CiteScore

10.30

Self-citation rate

9.30%

Articles published

266

About the journal: Behavior Research Methods publishes articles concerned with the methods, techniques, and instrumentation of research in experimental psychology. The journal focuses particularly on the use of computer technology in psychological research. An annual special issue is devoted to this field.