Post-processing automatic transcriptions with machine learning for verbal fluency scoring

Impact factor 2.4; Zone 3, Computer Science; JCR Q2, Acoustics

Speech Communication. Pub Date: 2023-09-27. DOI: 10.1016/j.specom.2023.102990

Justin Bushnell, Frederick Unverzagt, Virginia G. Wadley, Richard Kennedy, John Del Gaizo, David Glenn Clark

Speech Communication, volume 155, Article 102990. Full text: https://www.sciencedirect.com/science/article/pii/S0167639323001243

Citations: 0

Abstract


Post-processing automatic transcriptions with machine learning for verbal fluency scoring

Objective

To compare verbal fluency scores derived from manual transcriptions to those obtained using automatic speech recognition enhanced with machine learning classifiers.

Methods

Using Amazon Web Services, we automatically transcribed verbal fluency recordings from 1400 individuals who performed both animal and letter F verbal fluency tasks. We manually adjusted timings and contents of the automatic transcriptions to obtain “gold standard” transcriptions. To make automatic scoring possible, we trained machine learning classifiers to discern between valid and invalid utterances. We then calculated and compared verbal fluency scores from the manual and automatic transcriptions.

Results

For both animal and letter fluency tasks, we achieved good separation of valid versus invalid utterances. Verbal fluency scores calculated based on automatic transcriptions showed high correlation with those calculated after manual correction.

Conclusion

Many techniques for scoring verbal fluency word lists require accurate transcriptions with word timings. We show that machine learning methods can be applied to improve off-the-shelf ASR for this purpose. These automatically derived scores may be satisfactory for some applications. Low correlations among some of the scores indicate the need for improvement in automatic speech recognition before a fully automatic approach can be reliably implemented.
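The scoring and comparison steps summarized in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the `Token` structure, the `p_valid` classifier probability, the 0.5 threshold, and the 60-second time limit are all assumptions introduced here for illustration. The sketch counts unique words judged valid in a timed word list, then computes a Pearson correlation between two sets of scores (for example, manual versus automatic).

```python
from dataclasses import dataclass
from math import sqrt


@dataclass
class Token:
    """One transcribed word (hypothetical structure, not from the paper)."""
    word: str
    start: float    # onset time in seconds
    p_valid: float  # assumed classifier probability that the word is a valid response


def fluency_score(tokens, threshold=0.5, time_limit=60.0):
    """Count unique words judged valid that begin before the time limit.

    Both the threshold and the time limit are illustrative assumptions.
    """
    seen = set()
    for t in tokens:
        if t.start < time_limit and t.p_valid >= threshold:
            seen.add(t.word.lower())  # repetitions are counted only once
    return len(seen)


def pearson(xs, ys):
    """Pearson correlation between two equal-length score lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / sqrt(vx * vy)


# Toy data: one filler word, one repetition, one word after the time limit.
example = [
    Token("dog", 1.0, 0.9),
    Token("um", 2.0, 0.1),
    Token("Dog", 3.0, 0.8),
    Token("cat", 4.0, 0.7),
    Token("horse", 61.0, 0.9),
]
example_score = fluency_score(example)
```

In this toy list only "dog" and "cat" count toward the score. In practice the validity probability would come from a trained classifier, and the correlation would be computed across participants.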


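Source journal

Speech Communication (Engineering and Technology, Computer Science: Interdisciplinary Applications)

CiteScore: 6.80

Self-citation rate: 6.20%

Articles published:

Review time: 19.2 weeks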

About the journal: Speech Communication is an interdisciplinary journal whose primary objective is to fulfil the need for the rapid dissemination and thorough discussion of basic and applied research results. The journal's primary objectives are: • to present a forum for the advancement of human and human-machine speech communication science; • to stimulate cross-fertilization between different fields of this domain; • to contribute towards the rapid and wide diffusion of scientifically sound contributions in this domain.