Speech Command Recognition Using Deep Learning

M. Ayache, Hussien Kanaan, Kawthar Kassir, Yasser Kassir
Published in: 2021 Sixth International Conference on Advances in Biomedical Engineering (ICABME)
Publication date: 2021-10-07
DOI: 10.1109/ICABME53305.2021.9604862 (https://doi.org/10.1109/ICABME53305.2021.9604862)
Citations: 8

Abstract

Speech recognition software is a computer program trained to take human speech as input, interpret it, and transcribe it into text. The field has recently benefited from advances in deep learning and big data, as shown both by the surge of academic papers and, more importantly, by the worldwide industry adoption of deep learning methods for designing and deploying speech recognition systems. This paper proposes an accurate end-user software system that recognizes specific spoken commands used to control a robot performing specified tasks in a hospital. The model is based on deep learning because deep learning is effective when large amounts of data are available, as is the case for the two versions of the Google TensorFlow and AIY datasets used here. A convolutional neural network is used because it extracts features directly from the dataset rather than relying on traditional feature-extraction methods, which saves training time and reduces system complexity. In addition, NVIDIA CUDA is used to train the model on a GPU to further decrease training time. During training, experiments were carried out to observe the effect of several parameters on the system's results and to check that the parameters chosen for the model are the best. The results indicate that the training, validation, and testing accuracies of the proposed approach were high, that training duration was very low thanks to the CUDA Toolkit, and that the commands were successfully recognized by the model. These results outperform those of papers on similar work, which are presented in the following sections.
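As a rough illustration of how a convolutional layer derives features from its input instead of relying on hand-crafted ones, the sketch below applies one convolution, a ReLU, and a max-pooling step to a tiny spectrogram-like matrix in pure Python. It is not the authors' model: the input values, the kernel, and the function names (`conv2d_valid`, `relu`, `max_pool2d`) are assumptions made for illustration, and in a real network the kernel weights would be learned during training rather than fixed by hand.

```python
from typing import List

Matrix = List[List[float]]


def conv2d_valid(x: Matrix, kernel: Matrix) -> Matrix:
    """2D cross-correlation without padding, the core operation of a CNN layer."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(x) - kh + 1
    out_w = len(x[0]) - kw + 1
    out = []
    for i in range(out_h):
        row = []
        for j in range(out_w):
            acc = 0.0
            for di in range(kh):
                for dj in range(kw):
                    acc += x[i + di][j + dj] * kernel[di][dj]
            row.append(acc)
        out.append(row)
    return out


def relu(x: Matrix) -> Matrix:
    """Element-wise rectified linear unit: negative responses become zero."""
    return [[max(0.0, v) for v in row] for row in x]


def max_pool2d(x: Matrix, size: int = 2) -> Matrix:
    """Non-overlapping max pooling; incomplete trailing windows are dropped."""
    out = []
    for i in range(0, len(x) - size + 1, size):
        row = []
        for j in range(0, len(x[0]) - size + 1, size):
            window = [x[i + di][j + dj] for di in range(size) for dj in range(size)]
            row.append(max(window))
        out.append(row)
    return out


# Hypothetical 5x5 patch (rows: frequency bins, columns: time frames).
# Energy switches on at the third time frame in every frequency bin.
spectrogram = [[0.0, 0.0, 1.0, 1.0, 1.0] for _ in range(5)]

# Hypothetical hand-set kernel that responds to a rise in energy over time.
onset_kernel = [[-1.0, 1.0],
                [-1.0, 1.0]]

features = max_pool2d(relu(conv2d_valid(spectrogram, onset_kernel)))
print(features)  # [[2.0, 0.0], [2.0, 0.0]]
```

The non-zero entries mark where the onset occurs in time, which is the kind of local pattern a trained convolutional layer can pick up without a separate feature-extraction stage.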