Design and implementation of smart voice assistant and recognizing academic words

International Robotics & Automation Journal Pub Date : 2022-02-24 DOI:10.15406/iratj.2022.08.00240

A. Abougarair, Mohamed KI Aburakhis, Mohamed O Zaroug


Citations: 10

Abstract

This paper presents a virtual assistant that uses neural networks to recognize commonly used words. Its main purpose is to make users' daily lives easier by capturing spoken commands and turning them into actions. The assistant, named Alice, is built on four main techniques: hot word detection, voice-to-text conversion, intent recognition, and text-to-voice conversion. Linux is the operating system of choice for developing and running the assistant because it is open source and runs on most single-board computers. Python is chosen as the development language for its capabilities and its compatibility with the APIs and libraries the project requires. The virtual assistant is also required to communicate with IoT devices. In addition, a speech recognition system is created to recognize significant technical words. An artificial neural network (ANN) is evaluated with different network structures and training algorithms, in conjunction with the Mel Frequency Cepstral Coefficient (MFCC) feature extraction technique, to increase the identification rate and find the best-performing configuration. For training, the Levenberg-Marquardt (LM), BFGS Quasi-Newton, and Resilient Backpropagation algorithms are compared using 10 MFCCs with 10 to 50 neurons in increments of 10; the same comparison is repeated for 13 MFCCs with 10 to 50 neurons.
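The training comparison in the abstract can be read as a grid over three factors: training algorithm, number of MFCCs, and number of hidden neurons. The sketch below only enumerates that grid; it does not train anything. It assumes the abstract names three separate algorithms (the wording is ambiguous), and the function name `experiment_grid` and the dictionary keys are hypothetical, chosen for illustration rather than taken from the paper.

```python
from itertools import product

# Assumption: the abstract lists three distinct training algorithms.
ALGORITHMS = (
    "Levenberg-Marquardt",
    "BFGS Quasi-Newton",
    "Resilient Backpropagation",
)
MFCC_COUNTS = (10, 13)               # MFCC feature counts named in the abstract
HIDDEN_NEURONS = range(10, 51, 10)   # 10, 20, 30, 40, 50


def experiment_grid():
    """Return every (algorithm, n_mfcc, n_hidden) combination to compare."""
    return [
        {"algorithm": algo, "n_mfcc": n_mfcc, "n_hidden": n_hidden}
        for algo, n_mfcc, n_hidden in product(
            ALGORITHMS, MFCC_COUNTS, HIDDEN_NEURONS
        )
    ]


if __name__ == "__main__":
    grid = experiment_grid()
    print(len(grid), "configurations")
    for config in grid[:3]:
        print(config)
```

Under these assumptions the grid has 3 × 2 × 5 = 30 configurations, each of which would be trained and scored on identification rate to find the best-performing one.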

