A proof-of-concept study for automatic speech recognition to transcribe AAC speakers' speech from high-technology AAC systems.

IF 2.5 · CAS Zone 4 (Medicine) · JCR Q1 (Rehabilitation)
Assistive Technology · Pub Date: 2024-07-03 · Epub Date: 2023-10-05 · DOI: 10.1080/10400435.2023.2260860
Szu-Han Kay Chen, Conner Saeli, Gang Hu
Citations: 0

Abstract

Automatic speech recognition (ASR) is an emerging technology that has been used to recognize the non-typical speech of people with speech impairment and to speed up language sample transcription in communication sciences and disorders. However, the feasibility of using ASR to recognize speech samples from high-tech Augmentative and Alternative Communication (AAC) systems has not been investigated. This proof-of-concept paper investigates the feasibility of using an AAC-ASR model to transcribe language samples generated by high-tech AAC systems and compares its recognition accuracy with that of two published ASR models: CMU Sphinx and Google Speech-to-text. An AAC-ASR model was developed that transcribes simulated AAC speaker language samples. The AAC-ASR model's word error rate (WER) was compared with those of CMU Sphinx and Google Speech-to-text. On the test files, the AAC-ASR model achieved a lower WER (28.6%) than CMU Sphinx and Google Speech-to-text (70.7% and 86.2%, respectively). Our results demonstrate the feasibility of using the ASR model to automatically transcribe high-technology AAC-simulated language samples to support language sample analysis. Future steps will focus on developing the model with diverse AAC speech training datasets and understanding the speech patterns of individual AAC users to refine the AAC-ASR model.
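
The abstract reports results as word error rate but does not say how it was computed. As a rough illustration only, the sketch below uses the conventional definition, WER = (substitutions + deletions + insertions) / number of reference words, obtained from a word-level edit distance. The function name `word_error_rate`, the lowercase-and-split normalization, and the example sentences are assumptions made for this sketch and are not taken from the paper.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words.

    Assumption: text is normalized by lowercasing and splitting on whitespace.
    The paper may have used a different normalization.
    """
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words processed
    # so far (excluding the current one) and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1       # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr[j] = min(substitution, deletion, insertion)
        prev = curr
    return prev[-1] / len(ref)


# Hypothetical example: one deleted word ("to") and one inserted word ("now")
# against a five-word reference.
print(word_error_rate("I want to go outside", "I want go outside now"))  # 0.4
```

Under this definition a lower value is better, and the value can exceed 100% when the hypothesis contains many inserted words.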

Source journal
Assistive Technology (Rehabilitation)
CiteScore: 4.00
Self-citation rate: 5.60%
Articles published: 40
Journal description: Assistive Technology is an applied, scientific publication in the multi-disciplinary field of technology for people with disabilities. The journal's purpose is to foster communication among individuals working in all aspects of the assistive technology arena, including researchers, developers, clinicians, educators and consumers. The journal will consider papers from all assistive technology applications. Only original papers will be accepted. Technical notes describing preliminary techniques, procedures, or findings of original scientific research may also be submitted. Letters to the Editor are welcome. Books for review may be sent to authors or publisher.