Voice Command Recognition for Drone Control by Deep Neural Networks on Embedded System

Cengizhan Yapıcıoğlu, Z. Dokur, T. Ölmez
2021 8th International Conference on Electrical and Electronics Engineering (ICEEE), published 2021-04-09
DOI: 10.1109/ICEEE52452.2021.9415964 (https://doi.org/10.1109/ICEEE52452.2021.9415964)

Abstract

Speech recognition and its application to system control have been an important and attractive research topic over the last few decades. Controlling electronic devices by speech commands lets users manage systems quickly and easily, since they need no additional information or remote controller. Communicating with a system through speech commands also brings the requirement of a fast and accurate response. For this reason, speech recognition algorithms are at present mostly run on high-performance computers. However, improvements in system-on-a-chip (SoC) boards and in deep neural network based algorithms make it possible to execute such programs on embedded boards. This study proposes a model for controlling a drone in real time with Turkish directional speech commands, based on deep learning applied to spectrogram images. First, speech commands are detected in real time with the help of signal energy and zero-crossing rate, and are converted to log-spectrogram images. A CNN (three convolutional layers and a fully connected layer) is created and trained on those images. The trained model is then moved to an embedded board to achieve real-time, low-cost performance. Speech commands given by the user are passed immediately to the model as input. The algorithm then decides which directional command was given, and the desired operation is performed on the drone. With the proposed model, accuracies of 95.72% on the offline dataset and 92.88% for real-time classification are obtained.
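The detection and conversion steps named in the abstract (signal energy, zero-crossing rate, log spectrogram) can be illustrated with a minimal sketch. It is not the authors' implementation. The function names, frame length, hop size, thresholds, the "high energy and low zero-crossing rate" decision rule, and the Hann window are all assumptions made for illustration, and the input is assumed to be a list of samples normalized to [-1, 1]. A naive DFT is used so that the example needs only the Python standard library.

```python
import math

def frame_signal(samples, frame_len, hop):
    """Split samples into overlapping frames; a trailing partial frame is dropped."""
    return [samples[i:i + frame_len]
            for i in range(0, len(samples) - frame_len + 1, hop)]

def short_time_energy(frame):
    """Mean squared amplitude of one frame."""
    return sum(x * x for x in frame) / len(frame)

def zero_crossing_rate(frame):
    """Fraction of adjacent sample pairs whose signs differ."""
    crossings = sum(1 for a, b in zip(frame, frame[1:])
                    if (a >= 0) != (b >= 0))
    return crossings / (len(frame) - 1)

def detect_speech_frames(samples, frame_len=200, hop=100,
                         energy_thresh=0.01, zcr_thresh=0.25):
    """Flag each frame as speech when energy is high and ZCR is low.

    The rule and the threshold values are illustrative assumptions,
    not values taken from the paper.
    """
    return [short_time_energy(f) > energy_thresh
            and zero_crossing_rate(f) < zcr_thresh
            for f in frame_signal(samples, frame_len, hop)]

def log_spectrum(frame, eps=1e-10):
    """Log magnitude spectrum of one Hann-windowed frame (naive DFT).

    Returns len(frame) // 2 + 1 values; stacking these over the frames
    of a detected command gives one log-spectrogram image.
    """
    n = len(frame)
    windowed = [x * (0.5 - 0.5 * math.cos(2 * math.pi * i / (n - 1)))
                for i, x in enumerate(frame)]
    spectrum = []
    for k in range(n // 2 + 1):
        re = sum(w * math.cos(2 * math.pi * k * i / n)
                 for i, w in enumerate(windowed))
        im = -sum(w * math.sin(2 * math.pi * k * i / n)
                  for i, w in enumerate(windowed))
        spectrum.append(math.log(math.hypot(re, im) + eps))
    return spectrum

def log_spectrogram(samples, frame_len=200, hop=100):
    """One log spectrum per frame, as a list of rows."""
    return [log_spectrum(f) for f in frame_signal(samples, frame_len, hop)]
```

In use, the frames flagged by detect_speech_frames would mark the start and end of a spoken command, and log_spectrogram applied to that segment would yield the image fed to the classifier. On a real embedded board an FFT routine would replace the naive DFT, which is quadratic in the frame length.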