Edge Computing Solutions Supporting Voice Recognition Services for Speakers with Dysarthria

2023 IEEE/ACM 23rd International Symposium on Cluster, Cloud and Internet Computing Workshops (CCGridW). Pub Date: 2023-05-01. DOI: 10.1109/CCGridW59191.2023.00047

Davide Mulfari, Lorenzo Carnevale, A. Galletta, M. Villari


Citations: 1

Abstract

In the framework of Automatic Speech Recognition (ASR), the synergy between edge computing and artificial intelligence has led to the development of intelligent objects that process and respond to human speech. This is a key enabler for multiple application scenarios, such as smart home automation, where the user’s voice is an interface for interacting with appliances and computer systems. However, for millions of speakers with dysarthria worldwide, such voice interaction is impossible because current ASR technologies are not robust to their atypical speech commands. As a result, these people, who also live with severe motor disabilities, are unable to benefit from many voice assistant services that might support their everyday life. To cope with these challenges, this paper proposes a deep learning approach to isolated word recognition in the presence of dysarthria, along with the deployment of customized ASR models on machine-learning-powered edge computing nodes. In this way, we work toward a low-cost, portable solution with the potential to operate next to the user with a disability, e.g., in a wheelchair or beside a bed, in an always-active mode. Finally, experiments demonstrate the effectiveness of our speech recognition solution, measured by word error rate, in comparison with other studies on isolated word recognition for impaired speech.

