Designing a Voice-controlled Wheelchair for Persian-speaking Users Using Deep Learning Networks with a Small Dataset

Transactions on machine learning research. Pub Date: 2021-01-01. DOI: 10.11648/j.mlr.20210601.11

Masoud Amiri, Manizheh Ranjbar, Mostafa Azami Gharetappeh


Citations: 0


Designing a Voice-controlled Wheelchair for Persian-speaking Users Using Deep Learning Networks with a Small Dataset

Abstract: As technology advances, demand is growing for ways to improve the quality of life of elderly and disabled people, and advanced technologies in the field of rehabilitation are helping them overcome their difficulties. Many existing electrical and electronic devices can be made more controllable and more functional with artificial intelligence. In every society, some people with spinal disabilities cannot move their limbs, so they cannot use a conventional wheelchair and need one with voice control. The main challenge of this project is to identify the voice patterns of disabled people. Audio classification is one of the challenging problems in pattern recognition. This paper presents a method for classifying ambient sounds from their spectrograms with deep neural networks, and applies it to the speech of Persian speakers in order to build a voice-controlled intelligent wheelchair. We used Inception-V3, a convolutional neural network pretrained on the ImageNet dataset, and then trained it on spectrogram images generated from the ambient sound of about 50 Persian speakers. The experiments achieved a mean accuracy of 83.33%. The design also allows a third party (such as a spouse, child, or parent) to control the wheelchair through an application installed on their mobile phone. The wheelchair will be able to execute five commands: stop, left, right, front, and back.
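To illustrate the spectrogram step that the abstract describes, here is a minimal sketch in pure Python. The paper does not state its framing parameters, so the frame length, hop size, Hann window, sampling rate, and the index order of the command labels are all assumptions made for this example, and a synthetic tone stands in for a real recording. In the described system the spectrogram would be rendered as an image and passed to the ImageNet-pretrained Inception-V3 network; that network is not reproduced here, and `decode_command` is a hypothetical helper that only shows how five class scores could map to a command.

```python
import cmath
import math

# The five commands named in the abstract; the index order is an assumption.
COMMANDS = ["stop", "left", "right", "front", "back"]


def hann(n):
    """Hann window of length n (reduces spectral leakage at frame edges)."""
    return [0.5 - 0.5 * math.cos(2 * math.pi * i / (n - 1)) for i in range(n)]


def spectrogram(samples, frame_len=64, hop=32):
    """Magnitude spectrogram via a naive short-time DFT.

    Returns a list of frames; each frame holds frame_len // 2 + 1 magnitudes
    (bins from 0 Hz up to the Nyquist frequency).
    """
    window = hann(frame_len)
    frames = []
    for start in range(0, len(samples) - frame_len + 1, hop):
        # Apply the window to one frame of audio.
        chunk = [samples[start + i] * window[i] for i in range(frame_len)]
        mags = []
        for k in range(frame_len // 2 + 1):
            # Direct DFT sum for frequency bin k.
            acc = sum(
                chunk[t] * cmath.exp(-2j * math.pi * k * t / frame_len)
                for t in range(frame_len)
            )
            mags.append(abs(acc))
        frames.append(mags)
    return frames


def decode_command(scores):
    """Map a vector of five class scores to the highest-scoring command."""
    best = max(range(len(scores)), key=lambda i: scores[i])
    return COMMANDS[best]


# Synthetic stand-in for a recording: a 1 kHz tone sampled at 8 kHz.
SAMPLE_RATE = 8000
tone = [math.sin(2 * math.pi * 1000 * t / SAMPLE_RATE) for t in range(512)]
spec = spectrogram(tone)

# With 64-sample frames each bin spans 8000 / 64 = 125 Hz, so bin 8 is 1 kHz.
peak_bin = max(range(len(spec[0])), key=lambda k: spec[0][k])
```

The naive DFT keeps the sketch free of dependencies but costs O(n²) per frame; a practical pipeline would more likely use an FFT routine from a numerical library and a log or mel scaling before saving the image, neither of which the abstract specifies.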

