Valeria A Pfeifer, Trish D Chilton, Matthew D Grilli, Matthias R Mehl
DOI: https://doi.org/10.3758/s13428-024-02440-1
Published: 2024-10-01 (Epub 2024-05-21). Publication type: Journal Article.
Full text (PDF): https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11365748/pdf/
How ready is speech-to-text for psychological language research? Evaluating the validity of AI-generated English transcripts for analyzing free-spoken responses in younger and older adults.
For a long time, the gold standard for preparing spoken language corpora for text analysis in psychology was human transcription. However, this standard comes at considerable cost and creates barriers to quantitative spoken language analysis that recent advances in speech-to-text technology could address. The current study quantifies the accuracy of AI-generated transcripts compared to human-corrected transcripts across younger (n = 100) and older (n = 92) adults and two spoken language tasks. Further, it evaluates the validity of Linguistic Inquiry and Word Count (LIWC) features extracted from these two kinds of transcripts, as well as from transcripts specifically prepared for LIWC analyses via tagging. We find that, overall, AI-generated transcripts are highly accurate, with a word error rate of 2.50% to 3.36%, although they are slightly less accurate for younger than for older adults. LIWC features extracted from the two kinds of transcripts are highly correlated, while the tagging procedure significantly alters filler word categories. Based on these results, automatic speech-to-text appears to be ready for psychological language research when using spoken language tasks in relatively quiet environments, unless filler words are of interest to researchers.
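The word error rate reported above can be illustrated with a minimal sketch. It assumes the standard definition, WER = (substitutions + deletions + insertions) / number of reference words, computed with a word-level edit distance. The abstract does not describe the authors' exact scoring pipeline, so the function name `word_error_rate`, the lowercasing, and the whitespace tokenization are all assumptions made for illustration, and the example sentences are invented.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words.

    Assumption: text is lowercased and split on whitespace; a real
    pipeline would also need a policy for punctuation and filler words.
    """
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference transcript must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen
    # so far (before the current one) and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1       # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr[j] = min(substitution, deletion, insertion)
        prev = curr
    return prev[-1] / len(ref)


if __name__ == "__main__":
    human = "the cat sat on the mat"   # hypothetical human-corrected transcript
    machine = "the cat sit on mat"     # hypothetical AI-generated transcript
    print(f"WER = {word_error_rate(human, machine):.2%}")
```

In this toy pair the AI transcript has one substitution ("sat" to "sit") and one deletion (the second "the") against six reference words. Because insertions count as errors, WER can exceed 100% when the hypothesis is much longer than the reference.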