Voice Command Recognition for Drone Control by Deep Neural Networks on Embedded System

2021 8th International Conference on Electrical and Electronics Engineering (ICEEE) Pub Date : 2021-04-09 DOI:10.1109/ICEEE52452.2021.9415964

Cengizhan Yapıcıoğlu, Z. Dokur, T. Ölmez



Abstract

Speech recognition and its application to system control have been an important and attractive research topic over the last few decades. Controlling electronic devices with speech commands lets users manage systems quickly and easily, since they need no additional information or remote controller. Communicating with a system through speech commands also brings the requirement of fast and accurate response. For this reason, speech recognition algorithms are at present mostly run on high-performance computers. However, improvements in system-on-a-chip (SoC) boards and in deep neural network based algorithms make it possible to execute such programs on embedded boards. This study proposes a model, based on deep learning with spectrogram images, for controlling a drone system in real time using Turkish directional speech commands. First, speech commands are detected in real time with the help of signal energy and zero-crossing rate, and the detected segments are converted to log-spectrogram images. A CNN with three convolutional layers and one fully connected layer is created and trained on those images. The trained model is then moved to an embedded board to achieve real-time, low-cost performance. Speech commands spoken by the user are passed to the model as input as they are given. The algorithm then decides which directional command was given, and the desired operation is performed on the drone system. With the proposed model, accuracies of 95.72% on the offline dataset and 92.88% for real-time classification are obtained.
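The front end described in the abstract (detection by signal energy and zero-crossing rate, followed by conversion to a log spectrogram) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the frame length, hop size, thresholds, the rule that combines energy with zero-crossing rate, and the function names `frame_signal`, `detect_speech_frames`, and `log_spectrogram` are all assumptions made for illustration, since the abstract does not specify them.

```python
import numpy as np

def frame_signal(x, frame_len, hop):
    """Split a 1-D signal into overlapping frames, one frame per row."""
    n_frames = 1 + (len(x) - frame_len) // hop
    idx = np.arange(frame_len)[None, :] + hop * np.arange(n_frames)[:, None]
    return x[idx]

def detect_speech_frames(x, frame_len=400, hop=160,
                         energy_thresh=0.01, zcr_thresh=0.25):
    """Return a boolean mask marking frames that look like speech.

    Assumed rule (not from the paper): a frame counts as speech when its
    short-time energy is high and its zero-crossing rate is low.
    """
    frames = frame_signal(x, frame_len, hop)
    energy = np.mean(frames ** 2, axis=1)            # short-time energy
    signs = np.signbit(frames)
    # fraction of adjacent sample pairs whose sign differs
    zcr = np.mean(signs[:, 1:] != signs[:, :-1], axis=1)
    return (energy > energy_thresh) & (zcr < zcr_thresh)

def log_spectrogram(x, frame_len=400, hop=160, eps=1e-10):
    """Log power spectrogram with shape (freq_bins, time_frames)."""
    frames = frame_signal(x, frame_len, hop) * np.hanning(frame_len)
    power = np.abs(np.fft.rfft(frames, axis=1)) ** 2
    return np.log(power + eps).T                     # eps avoids log(0)
```

With audio sampled at 16 kHz (also an assumption), the defaults correspond to 25 ms frames with a 10 ms hop. The resulting two-dimensional array is the kind of image that a small CNN, such as the three-convolutional-layer network mentioned above, could take as input.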


